When skipping a comment, the total number of expected comments goes
down. But it should not fall below zero.
This issue was benign: in the case of wraparound, we would still read to
the end of the block as usual, and no further. The 'comments_len'
variable was only used after that to report a format error in case of a
mismatch between how many comments we read, and how many the header said
there would be. The preallocation for the comments vector has already
happened at that point, so this could not cause a large allocation.
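The fix amounts to refusing to let the counter wrap. A minimal sketch, assuming a u32 counter and a hypothetical skip_comment helper (not Claxon's actual API):

```rust
// Hypothetical sketch: decrement the expected-comment counter without
// wrapping below zero. `comments_len` mirrors the u32 count read from
// the Vorbis comment header.
fn skip_comment(comments_len: u32) -> u32 {
    // saturating_sub clamps at zero instead of wrapping to u32::MAX.
    comments_len.saturating_sub(1)
}
```

With a plain `comments_len - 1`, a count of zero would wrap around to u32::MAX in a release build.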
This issue was caught on CI by LibFuzzer.
This file was produced by modifying the double Vorbis comment test
sample in a hex editor. The second comment (7 bytes) was just long
enough to fit one length prefix (4 bytes) and one "X=Y" comment.
While this example file is artificial, I did receive a real-world file
where this happens in a bug report.
The test currently fails; the fix will be in a separate commit.
This is the same data as in the non_subset test sample. That sample has
been verified against the reference decoder, so the expected output here
must be the correct output. This is more of a test that will catch bugs
early, and pinpoint the cause of a difference, if I ever change the
high-order LPC prediction code, than a test that improves coverage -- we
already have the test sample for that.
The test sample was constructed by identifying the start and end byte of
the frame that caused the decode to fail, dd-ing that frame out, then
dd-ing out the header of the file too, up to the end of the STREAMINFO
block, and concatenating those. (I first removed all other metadata
blocks with metaflac to ensure that the last block flag is set correctly
in the STREAMINFO block.)
That results in a file that is decodable by Claxon, but it suffers from
two issues:
* The MD5 signature of the audio data in the STREAMINFO block is now
wrong, so libflac fails the decode.
* The number of samples in the STREAMINFO block is now wrong.
I fixed both of these in a hex editor, by zeroing out the MD5 signature
and setting the correct length. Even better would be to compute the
correct MD5 signature, but frankly I never reverse-engineered what it is
a signature of. The spec says "MD5 signature of the unencoded audio
data", but there is no such thing as "unencoded" audio. For example, in
Claxon you always get i32 samples, but often the bit depth will be 16
bits per sample, and the upper bits are zero. Whether you include those
zero bits matters a lot for the signature! And then still, are samples
of channels interleaved? And in what order? In what endianness? So it is
not clear to me what bytes to take the MD5 of.
I searched for an existing tool that can replace the signature with one
that it computes over the data, but I found no such thing, and flac
itself reencodes the audio data too. So for now, just zeroing the
signature will have to do.
This is a single frame extracted from a larger file that could not be
decoded before implementing support for LPC > 12. We deliberately
include only a single frame to make the original audio unrecognizable.
The header still includes the MD5 checksum of the audio data of the full
file, so when decoding with libflac, checksum verification fails. I
should still fix the checksum in the file to make the test pass.
FLAC files that are not subset-compliant may contain LPC orders greater
than 12. I had never observed such a file in the wild, so I optimized
for subset files, and this optimization relies on the upper bound. But
we can still provide a fallback for high LPC orders.
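Such a fallback boils down to a generic prediction loop that works for any order. A hedged sketch (names and signature are illustrative, not Claxon's internal API; FLAC convention puts the first coefficient on the most recent sample):

```rust
// Generic LPC prediction for arbitrary orders. Subset streams bound the
// order at 12, which allows a fixed-size scratch buffer; this loop has
// no such bound. The first `order` entries of `buffer` are warm-up
// samples; the remaining entries hold residuals on entry and decoded
// samples on exit.
fn predict_lpc(coefficients: &[i64], qlp_shift: i32, buffer: &mut [i64]) {
    let order = coefficients.len();
    for i in order..buffer.len() {
        // Dot product of the coefficients with the preceding `order`
        // samples, most recent sample first, then apply the
        // quantization shift.
        let window = &buffer[i - order..i];
        let prediction: i64 = coefficients
            .iter()
            .zip(window.iter().rev())
            .map(|(&c, &s)| c * s)
            .sum();
        buffer[i] += prediction >> qlp_shift;
    }
}
```

The subset fast path can keep its fixed-size buffers; this only needs to run for the rare high-order case, so its lower speed does not matter much.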
Temporarily disable builds that use the nightly toolchain, as it cannot
be installed at the moment due to [1]. This issue will sort itself out
soon, after which this commit can be reverted.
It's a bit of a coincidence that the Travis cron build happened to run
today, of all days; had it run tomorrow, then I probably would not have
noticed a thing.
[1]: https://github.com/rust-lang/rust/issues/57488
Previously, the buffer was allocated to the right size, but the memory
was not zeroed, because the entire buffer would be overwritten anyway,
so zeroing would be a waste. But in the presence of bugs, this can cause
uninitialized memory to be exposed.
I did some careful benchmarking, and the performance impact of zeroing
is negligible for the common case where the buffer can be recycled, and
the block size is stable. That means that only initially the buffer
needs to be resized, and subsequent blocks all reuse the buffer without
any resizing (not in capacity, nor in vec len). Because of that, zeroing
is very uncommon, so its expense is not that important, because it is a
cold code path.
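The buffer preparation can be sketched as follows (a simplified version of the idea; the real Claxon code differs in details):

```rust
// Prepare the decode buffer for a block of `new_len` samples.
// `Vec::resize` zeroes any newly exposed elements, so the buffer never
// contains uninitialized memory. When the buffer already has exactly
// the right length, both branches are effectively no-ops, so the common
// recycling path pays almost nothing.
fn ensure_buffer_len(mut buffer: Vec<i32>, new_len: usize) -> Vec<i32> {
    if buffer.len() < new_len {
        // Cold path: grow and zero-fill. This only runs on the first
        // block, or when the block size grows.
        buffer.resize(new_len, 0);
    } else {
        // Shrink without touching memory; existing elements stay intact.
        buffer.truncate(new_len);
    }
    buffer
}
```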
I changed the benchmarks to report the raw decode times per sample
(averaged over a block still), and decoded 5 distinct files each 50
times. This made for a total of 1094220 timing measurements. With R we
can do a t-test to see the impact of zeroing.
In the first case, we compare the typical case: recycling the buffer.
> t.test(nozero_recycle, zero_recycle)
Welch Two Sample t-test
data: nozero_recycle and zero_recycle
t = -2.8154, df = 2186400, p-value = 0.004872
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.006822184 -0.001222073
sample estimates:
mean of x mean of y
14.12020 14.12422
> mean(zero_recycle) / mean(nozero_recycle)
[1] 1.000285
Values are decode time per sample in nanoseconds, so on my system Claxon
can do a sample in about 14 ns. As we can see, the impact of zeroing is
slight, but statistically significant.
In the second case, we truncate the decode buffer to 10 samples before
decoding a block. This is a worst-case scenario for zeroing, because it
has to zero out practically the entire buffer, and it can't be smart
about requesting zeroed pages from the OS, because the first 10 samples
have to be preserved. Truncating the buffer should not negatively affect
the no-zeroing code, because the capacity is still sufficient.
> t.test(nozero_truncate, zero_truncate)
Welch Two Sample t-test
data: nozero_truncate and zero_truncate
t = -23.217, df = 2188400, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.02758650 -0.02329142
sample estimates:
mean of x mean of y
14.10286 14.12830
> mean(zero_truncate) / mean(nozero_truncate)
[1] 1.001804
Here we can see a larger impact of about a 0.2% increase in decode time.
That's somewhat painful, given that many of the optimizations in Claxon
were wins of a similar order of magnitude. However, keep in mind that
this is for an atypical case.
As a sanity check, we should see no significant difference in decode
times between nozero_truncate and nozero_recycle, but that is not the
case: nozero_truncate is significantly faster.
> t.test(nozero_truncate, nozero_recycle)
Welch Two Sample t-test
data: nozero_truncate and nozero_recycle
t = -13.487, df = 2036300, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.01985537 -0.01481676
sample estimates:
mean of x mean of y
14.10286 14.12020
One difference is that in the truncate case, we mostly hit the code path
that does an unsafe Vec::set_len(), whereas in the recycling case, the
check for "buffer.len() < new_len" fails when the buffer already has
exactly the right size, and we call Vec::truncate() instead. Digging
into the Vec::truncate source code reveals that it is implemented as a
loop that drops the values. I have not confirmed whether this loop is
optimized away (it should be, because i32s do not need to be dropped),
but if it is not, that might explain the difference.
In any case, I think the overhead of zeroing is small enough that we can
afford it.
Thanks to Sergey "Shnatsel" Davidoff for getting this discussion
started.
To trigger the replenishing logic earlier. This improves coverage when
the sample size is limited, or conversely, it allows us to get the same
coverage with smaller samples, which speeds up fuzzing.
After the LLVM upgrade in Rust 1.25.0, rustc started generating better
code. This brings Claxon closer to libflac performance, without any
effort on my part! I re-ran the tools/benchmark_against_libflac.sh
script with a Rust 1.26.0 toolchain, and now Claxon is within 10% of
libflac performance, yay!
Somebody should try recompiling libflac with a recent Clang though; it
might undo the advantage again. Although Claxon probably benefits
more from optimizations than libflac does, because libflac is full of
manually vectorized code and low-level optimizations. Claxon's code is
far more high-level (still optimized to ensure that the compiler can
generate fast code from the high-level code though).
Since Rust 1.24.0, the number of codegen units is no longer 1, even for
release builds. This speeds up builds slightly, but it reduces the
quality of the generated code tremendously. To get decent performance
out of Claxon, we need to explicitly pass -C codegen-units=1 as
RUSTFLAGS. Update the benchmarks to do so, and add a note in the readme.
Thanks to est31 for pointing this out.
See also https://github.com/rust-lang/rust/issues/53833.
This happening by chance on a corrupted file is very unlikely due to
flac's checksums. It could be exploited by a maliciously crafted input,
but decoding valid flac files worked properly.
Many thanks to Sergey "Shnatsel" Davidoff for providing some of these
(the ones with a valid checksum). The smaller ones were found by the
fuzzer that I just added, but they don't have valid checksums, so they
don't cause the test to fail.
This is to reproduce the issue where the contents of the buffer affect
the decoding outcome. By decoding twice with different initial buffers,
we can discover dependencies on the buffer contents, which should not
exist.
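The check can be sketched like this, with a hypothetical decode_block standing in for the real decoder (the actual fuzz target is wired up differently):

```rust
// Decode the same input into two buffers with different initial
// contents, and verify that the result does not depend on what was in
// the buffer beforehand. A decoder that overwrites the entire buffer
// passes; a decoder that leaves stale contents in place fails.
fn decode_is_buffer_independent<F>(decode_block: F, input: &[u8], len: usize) -> bool
where
    F: Fn(&[u8], Vec<i32>) -> Vec<i32>,
{
    let out_a = decode_block(input, vec![0; len]);
    let out_b = decode_block(input, vec![-1; len]);
    out_a == out_b
}
```

Using all-zeros and all-ones bit patterns makes any leak of the initial contents show up as a mismatch.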
It is very unsafe, because it expects the caller to agree to overwrite
all bytes. This makes that a bit clearer. Just doing this does not
improve safety though.
A partition order could occur such that the block size was not a
multiple of 2^order. The computation of the number of samples per
partition did not account for this case and rounded down due to the bit
shift. This meant that we would not fill the entire decode buffer.
Claxon does not zero the decode buffer because it is (should be)
overwritten anyway, and in the case of a format error, where the buffer
might be only partially full, the buffer is not exposed again.
Furthermore, the way decoding works in most places is that we fill the
entire buffer by looping until it is full. If the input bitstream does
not contain enough data to fill the buffer, that is a format error.
In a few places though, we need to slice up the buffer before decoding
into it: for decoding individual channels, and also for decoding
residuals, which are split into partitions.
This particular format error was especially nasty because it did not
cause a format error down the line. Instead, it caused the buffer to be
sliced in a way where the slices together did not cover the entire
buffer, and so parts of uninitialized memory could remain in the buffer.
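The essence of the fix is a divisibility check before the shift. A minimal sketch with hypothetical names: the block is split into 2^order partitions, so the block size must be an exact multiple of 2^order.

```rust
// Compute the number of samples per residual partition, rejecting
// block sizes that are not an exact multiple of the partition count.
// Without the check, the bit shift would round down, and the partition
// slices together would not cover the entire decode buffer.
fn samples_per_partition(block_size: u32, partition_order: u32) -> Option<u32> {
    let n_partitions = 1u32 << partition_order;
    if block_size % n_partitions != 0 {
        // Not an exact multiple: treat this as a format error instead
        // of silently leaving part of the buffer unwritten.
        return None;
    }
    Some(block_size >> partition_order)
}
```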
Thanks a lot to Sergey "Shnatsel" Davidoff for reporting this bug,
together with elaborate steps to reproduce that allowed me to pinpoint
the cause quickly.