Standardization of sampling rate
#1
I am surprised to not read any recommendation about standardization of sampling rates and number of bits per sample.

From my simple understanding, differences in sampling rates should lead to additional effort and artefacts during resampling.

Any hints / comments / experiences ?

Reducing latency by increasing the sampling rate to extreme values like 96 kHz does not make sense to me
#2
(07-21-2020, 09:18 PM)MartinB Wrote: I am surprised to not read any recommendation about standardization of sampling rates and number of bits per sample.

From my simple understanding, differences in sampling rates should lead to additional effort and artefacts during resampling.

Any hints / comments / experiences ?

Reducing latency by increasing the sampling rate to extreme values like 96 kHz does not make sense to me
In my experience I just use whatever my interface can handle without any noise or other signal conversion artifacts. The main goal of JK is just being able to play with others without dropouts, noise or excessive latency. Audio fidelity isn't a consideration IMHO.
#3
Audio fidelity is important. Very!
If the sound quality were, e.g., 4 bits at 8000 Hz, then ...

And fortunately, so to speak, the sound quality in JamKazam is alright/good.

The recorded WAV files are, by the way, 32-bit at 48 kHz.
#4
(07-21-2020, 09:18 PM)MartinB Wrote: I am surprised to not read any recommendation about standardization of sampling rates and number of bits per sample.

From my simple understanding, differences in sampling rates should lead to additional effort and artefacts during resampling.

Any hints / comments / experiences ?

Reducing latency by increasing the sampling rate to extreme values like 96 kHz does not make sense to me

A pertinent point Martin,
Sadly, I cannot offer a clear answer, just a related experience, but let's pursue it and we might get some clarity in the process:
  • I connect to J-K with a Zoom H4, i.e. an ASIO interface, via a USB cable
  • its ASIO driver provides two sampling rates, 44.1 and 48 kHz
  • there is no major or audible quality difference between the rates
  • J-K must have an interface, either written in-house or adopted from elsewhere, to sync and work with the sampling rate
  • in my opinion both rates should work identically
  • however, if the 48 kHz rate works better, it must mean that J-K has written better (more stable) code for it
  • so here is my question to J-K users: which sampling rate works better and why, i.e. clearer sound, fewer dropouts, etc.?

  • The second point is 'Audio Frames'
  • in File/Audio Properties/Audio Boost/ it is possible to adjust the 'Audio Frame' size from 20 to 10 to 5 and even to 1 (in ms)
  • a general notion is that a shorter (smaller) audio frame makes for a better connection, sound, etc.
  • can somebody explain the math here...?!
    1. how does a shorter (smaller) audio frame relate to audio quality and the sampling rate?
    2. what is the relationship between audio frames and the sampling rate?
    3. if one adjusts audio frame in the J-K Audio Properties, does one have to adjust it in the ASIO device as well?

  • none of this should be difficult, and yet with J-K this is a Pandora's box!
    Cheers, Lex
#5
This is my simple picture of the process:
- The audio interface repeatedly measures the instantaneous signal level (amplitude) and describes it with a number of x bits. The more bits you use, the more finely you can describe the level and the larger the usable dynamic range. A single measurement is called a sample.
- The number of samples taken per second (the sampling rate) determines how well you can capture the frequencies in the music. According to the Nyquist-Shannon sampling theorem, the sampling rate has to be more than twice the highest frequency you want to capture. The upper limit of hearing is about 20 kHz, so a little over 40 kHz is the minimum.
- Different standards for bits per sample and sampling rate exist, e.g. CDs use 16 bit and 44.1 kHz; in other common areas of application, 24 or 32 bit and 48 kHz are used.
- For transmission the samples are sent in packets, i.e. they are buffered in frames. The size of a frame can be described either by the number of samples per frame or by the length of the frame in milliseconds (ms). A frame with 100 samples at a sampling rate of 48 kHz has a length of (100 / 48) ms, i.e. about 2 ms.
- The amount of audio data transmitted per second is roughly independent of the frame size (ignoring per-packet overhead) and, without compression, is given by the number of bits per sample times the number of samples per second. 24 bit samples at 48 kHz give 24 bit * 48 kHz = 1.152 Mbit/s, i.e. a little more than 1 Mbit/s or roughly 0.15 MB/s per channel (this has to be doubled for stereo, for example).
- Lowering the frame size obviously reduces latency, but increases the chance of packets not arriving in the correct order.
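
As a rough check of the arithmetic in the list above, here is a minimal Python sketch. The function names are made up for illustration and have nothing to do with JamKazam's actual code.

```python
def frame_duration_ms(samples_per_frame, sample_rate_hz):
    """Length of one frame in milliseconds."""
    return 1000.0 * samples_per_frame / sample_rate_hz


def raw_bitrate_bps(bits_per_sample, sample_rate_hz, channels=1):
    """Uncompressed data rate in bits per second; note that frame size does not appear."""
    return bits_per_sample * sample_rate_hz * channels


print(round(frame_duration_ms(100, 48_000), 2))   # 2.08 (ms)
print(raw_bitrate_bps(24, 48_000) / 1e6)          # 1.152 (Mbit/s, mono)
print(raw_bitrate_bps(24, 48_000) / 8 / 1e6)      # 0.144 (MB/s, mono)
```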

My question did not refer to the individual settings of the audio interface, but to potential problems arising when the members of a group use different settings. The resampling required when different sampling rates are used may be even more problematic with short frame sizes.

48 kHz to me seems to be a reasonable choice for a standard sampling rate.
#6
(07-21-2020, 09:18 PM)MartinB Wrote: I am surprised to not read any recommendation about standardization of sampling rates and number of bits per sample.

From my simple understanding, differences in sampling rates should lead to additional effort and artefacts during resampling.

Any hints / comments / experiences ?

Reducing latency by increasing the sampling rate to extreme values like 96 kHz does not make sense to me

TLDR summary: you're right.  It doesn't make sense.  However, the people doing it did have a good reason.

Here's my understanding.  (As a real-time software engineer and home recordist for 40 years, I'm a certified nerd.  Most of what I say is correct, even!)

First, the latency we're measuring here is audio interface round-trip latency, and has nothing to do with the network.  The network is important for JK, but it's not what we see as our latency when we measure it in JK.  (I'm new here, and perhaps there's a way to show latency for each member in a jam, and THAT would include network latency.)

But restricting ourselves to audio interface latency here, the main cause of this latency is buffering: storing sample points in a buffer that is passed between the software and hardware.  If my sample rate is 44100 samples per second and my buffer is 44100 samples long, then my latency going from hardware to software (audio input) is going to be 1 second.  (Yeah, that's an oversimplification.)  And it'll be another second going from software to hardware on output, for a total "round-trip" latency (in my computer) of 2 seconds.
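
To put numbers on that, a minimal sketch of the same oversimplified model (one buffer in, one buffer out). The function name is invented for illustration, and real drivers add converter and driver overhead on top.

```python
def buffer_latency_s(buffer_samples, sample_rate_hz):
    """Time needed to fill (or drain) one buffer, in seconds."""
    return buffer_samples / sample_rate_hz


one_way = buffer_latency_s(44_100, 44_100)   # hardware -> software
round_trip = 2 * one_way                     # plus software -> hardware
print(one_way, round_trip)                   # 1.0 2.0
```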

There are two ways to reduce this buffering latency.  The first is to use smaller buffers.  The second is to use a higher sample rate.

Generally the best way to reduce latency is to use smaller buffers.  There's a limit to that related to software engineering/architecture.  But if the buffer size is set to some constant by something, then the only other option is to increase the sample rate.  However, you're right that this is a nutty idea in this context.  Fortunately, not quite as nutty as one might think, due to how JK software works.

I'm new here, so some of this may be wrong, but my understanding is that JK software "normalizes" the audio from whatever your soundcard produces, and it uses the same audio format for everyone.  That's a good thing.  Now, if we knew what format it likes best and used that in our computer, it would make JK's job easier.  I believe it uses 48 kHz.  It apparently uses floating point internally, so 16 vs 24 bit samples won't matter much; either way it's pretty much the same conversion.  And it compresses the audio (like MP3) so that it doesn't use too much bandwidth.  I have no guess what compression scheme it uses, and don't really care.
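
Here's roughly what I mean by "the same conversion", as a sketch.  This is the usual way to turn integer PCM into floating point; whether JK does exactly this is my assumption, and the function name is made up.

```python
def pcm_int_to_float(sample, bits):
    """Map a signed integer sample of the given bit depth to the range [-1.0, 1.0)."""
    full_scale = 1 << (bits - 1)   # 32768 for 16 bit, 8388608 for 24 bit
    return sample / full_scale


print(pcm_int_to_float(16_384, 16))      # 0.5
print(pcm_int_to_float(4_194_304, 24))   # 0.5
print(pcm_int_to_float(-32_768, 16))     # -1.0
```

Only the divisor changes between 16 and 24 bit, so the cost is the same either way.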

Note that there's a limit to how low the buffering latency can be, related to how long the computer's attention might be on other stuff it can't interrupt to handle your audio.  Usually the biggest pole in that tent is "PCI bus latency" and you can google for apps to measure that.  IIRC a popular old one is "pcilat.exe" if you're on Windows.  When you try to go lower, you get pops and clicks due to the CPU being busy with other stuff when it needed to handle your audio buffer.  It doesn't hurt anything other than your ears, so it's fine to try.  Also, for jamming, an occasional pop or click may be better than always having too much latency.  It's a tradeoff that you get some control over.

Finally (and hopefully) the real reason that people use higher sample rates is to pass the audio setup test; if we don't pass, we don't get to play!  That test uses a fixed buffer size! (440 samples!) So, we cheat to pass the exam.
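
A quick sketch of why the cheat works, assuming the 440-sample figure above is right and using the same simple buffer model (the function name is mine):

```python
BUFFER_SAMPLES = 440  # default buffer size mentioned above


def buffer_ms(buffer_samples, sample_rate_hz):
    """Duration of one buffer in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


for rate_hz in (44_100, 48_000, 96_000):
    print(f"{rate_hz} Hz: {buffer_ms(BUFFER_SAMPLES, rate_hz):.2f} ms per buffer")
# 44100 Hz: 9.98 ms per buffer
# 48000 Hz: 9.17 ms per buffer
# 96000 Hz: 4.58 ms per buffer
```

Same number of samples, half the time at 96 kHz, so the measured latency drops without the buffer getting any smaller.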

But wait!  I lied.  It doesn't really use a fixed value.  It uses a fixed DEFAULT value.  I never would have guessed, but while that "add audio equipment" panel that tests our hardware is up, the "Manage" button on the panel behind is still active!  And we can use that to reduce the buffer size, all the way down to 1ms (it uses time rather than number of samples, which is a good idea.)  So, the main reason for using higher sample rates disappears, poof.

If that higher sample rate propagated all the way through the network, it would be really bad.  Fortunately it doesn't.  But it still adds a lot of unnecessary processing for your CPU, and you don't even get the benefit of better audio quality (because JK normalizes it to its preferred format, and extra quality is silently discarded.)

Regarding artefacts, I'd wager that the compression method's artefacts would overwhelm resampling artefacts.  You get bonus points for spelling it that way in this context.  I had to add it to my spell checker just now.
#7
You made two assumptions that surprised me:

Description of sample amplitude by floating point numbers: Somebody said that the JamKazam files use a 48 kHz sampling rate and 32 bit samples. I have never before, however, found a hint that JamKazam uses floating point numbers to describe the amplitude of the samples. In any case I do not expect too many problems in converting from 16 or 24 bit samples (or standard 32 bit integer samples).

Resampling by JamKazam: I understand that you assume that JamKazam not only somehow compresses the information for transmission but also somehow solves the problem of different sampling rates. It would be really interesting to know more about this magic.
#8
You're right.  I assumed 32-bit floating point and not 32-bit fixed point, because from a practical standpoint, 32-bit floating point is very useful and 32-bit fixed point is nearly useless.  When you see a .WAV file that someone says is 32-bit PCM, chances are good that it's 32-bit floating point and not fixed point.  But I didn't make a recording to verify.
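
For anyone who wants to verify with a real recording, here's a sketch that reads the format tag from a WAV header.  The tag values (1 = integer PCM, 3 = IEEE float, 0xFFFE = extensible) come from the WAV format itself; the function name is mine, and I haven't run it against a JK recording.

```python
import struct

FORMAT_NAMES = {1: "integer PCM", 3: "IEEE float", 0xFFFE: "extensible (see sub-format)"}


def wav_sample_format(data):
    """Return (format name, bits per sample) from the 'fmt ' chunk of a WAV file's bytes."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            # format tag, channels, sample rate, byte rate, block align, bits per sample
            tag, _ch, _rate, _byte_rate, _align, bits = struct.unpack_from("<HHIIHH", data, pos + 8)
            return FORMAT_NAMES.get(tag, f"unknown tag {tag}"), bits
        pos += 8 + size + (size & 1)   # chunks are padded to an even length
    raise ValueError("no 'fmt ' chunk found")
```

To use it, read the first few kilobytes of the recording in binary mode and pass them in.  A result of ("IEEE float", 32) would confirm the floating point guess; ("integer PCM", 32) would mean 32-bit fixed point.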

Regarding resampling, JK has to resample, in many cases.  It can only send audio to your soundcard at one sample rate.  So, the question is, does it resample on the sending end or receiving end?  There are pros and cons but my instinct would be to use a standard rate in the protocol, and then at each endpoint, resample as needed.  The big benefit to that is that at the receiving end, you can sum first and resample once, rather than resample each input stream and then sum.  In that case, there would be a preferred sample rate, the one used by the protocol.
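
A sketch of the "sum first, resample once" point.  It uses a crude linear-interpolation resampler (no anti-aliasing filter) purely for illustration; I don't know what JK actually uses, and the 48 kHz protocol rate here is an assumption.  Because resampling of this kind is a linear operation, both orders give the same result, but summing first needs only one resampling pass.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (crude: no anti-aliasing filter)."""
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate              # position in source samples
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append((1.0 - frac) * samples[j] + frac * nxt)
    return out


def mix(streams):
    """Sum equal-length streams sample by sample."""
    return [sum(column) for column in zip(*streams)]


# Three incoming streams of 10 ms each, all at the assumed protocol rate of 48 kHz.
streams = [[0.1 * k * (n + 1) for k in range(480)] for n in range(3)]

sum_then_resample = resample_linear(mix(streams), 48_000, 44_100)              # 1 pass
resample_then_sum = mix([resample_linear(s, 48_000, 44_100) for s in streams]) # 3 passes

max_diff = max(abs(a - b) for a, b in zip(sum_then_resample, resample_then_sum))
print(len(sum_then_resample), max_diff < 1e-9)   # 441 True
```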

But you're right; it seems that if there is a standard rate in the protocol, it should be encouraged.  And if there isn't, one should be encouraged as an ad-hoc standard.