用于套接字SOCK_DGRAM
UDP
所有数据包的22 bytes
长度均为(即64 including headers
)
客户端.c
...
no_of_packets--;
sprintf(buf, "#:!0 rem");
sprintf(buf, format , buf);
sprintf(buf_aux, "#: 0 rem");
sprintf(buf_aux, format , buf_aux);
buf[MAX_LINE-1] = '\0';
buf_aux[MAX_LINE-1] = '\0';
len = strlen(buf) + 1;
send(s, buf, len, 0);
while (no_of_packets-- > 1) {
nanosleep(&T, NULL);
send(s, buf, len, 0);
}
send(s, buf_aux, len, 0);
服务器端
...
while(1) {
if (len = recv(s, buf, sizeof(buf), 0)){
// do nothing
}
}
当我打开Wireshark
查看发送的数据包之间的平均延迟时,
我可以看到以下内容:
分钟延迟:0.000 006 795 秒 =>6微秒
最大限度延迟:0.000 260 952 秒 =>260 微秒
但我想每隔512纳秒(即 0.512 微秒)。
如何我能达到这个速度吗?
答案1
纳米睡眠的问题在于它包含相当大的开销。用户往往期望延迟精确到非常低的值,这对于 CPU 产生睡眠例程来说并不实际。期望如下:
actual delay = nanosleep(requested delay)
然而,现实中用户得到的是这样的:
actual delay = nanosleep(requested delay) + the fixed overheads.
在我的测试计算机上,强制 CPU 亲和性(以消除调度程序的影响)并将 CPU 频率缩放调节器设置为性能,固定开销为 52 微秒。
解决您问题的一个可能方法是用延迟循环替换对 nanosleep 的调用。这样 CPU 就不会让步,进入空闲状态,也不会产生操作系统调用开销。当然,延迟循环可能难以校准,无法真正获得所需的延迟。我制作了两个程序,一个使用 nanosleep,一个使用延迟循环进行演示。为了使其他开销可以忽略不计,每次执行时间和打印类型调用时,睡眠和延迟循环都会运行很多次:
作为参考,所使用的程序:
纳米睡眠版本:
/*****************************************************************************
*
* test_slp2.c 2019.09.18 Smythies
* Create data to determine the lower limit of nanosleep.
*
* test_slp.cpp 2017.11.25 Smythies
* Of course, none of this stuff works the way it used to.
*
* test_slp.cpp 2012.01.24 Smythies
* I need to be able to yeild (sleep), but for less than a second.
* Experiment with nanosleep and usleep functions.
*
*****************************************************************************/
// prevent warning about nanosleep
#define _POSIX_C_SOURCE 199309L
#include <sys/time.h>
#include <sys/types.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#define CR 13
unsigned long long stamp(void){
struct timeval tv;
gettimeofday(&tv, NULL);
return (unsigned long long)tv.tv_sec * 1000000 + tv.tv_usec;
} /* endprocedure */
int main(){
unsigned long long start, now;
long i, j, k;
int ns;
struct timespec time;
time.tv_sec = 0;
start = stamp();
for(ns = 500; ns < 100000; ns = ns + 500){
for(j = 0; j < 100000; j++){
time.tv_nsec = ns;
nanosleep(&time, &time);
} /* endfor */
now = stamp();
printf("%d %llu\n", ns, (now - start));
start = now;
} /* endfor */
return(0);
}
而延迟循环版本,包括粗略的校准,如图所示,效果并不好:
/*****************************************************************************
*
* test_slp3.c 2019.09.18 Smythies
* Now, do the same as test_slp2, but use a waste time loop instead of
nanosleep.
*
* test_slp2.c 2019.09.18 Smythies
* Create data to determine the lower limit of nanosleep.
*
* test_slp.cpp 2017.11.25 Smythies
* Of course, none of this stuff works the way it used to.
*
* test_slp.cpp 2012.01.24 Smythies
* I need to be able to yeild (sleep), but for less than a second.
* Experiment with nanosleep and usleep functions.
*
*****************************************************************************/
// prevent warning about nanosleep
#define _POSIX_C_SOURCE 199309L
#include <sys/time.h>
#include <sys/types.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
/* for my test computer */
#define CALIBRATION 26
unsigned long long stamp(void){
struct timeval tv;
gettimeofday(&tv, NULL);
return (unsigned long long)tv.tv_sec * 1000000 + tv.tv_usec;
} /* endprocedure */
int main(){
unsigned long long start, now;
long i, j, k, m, loops;
int ns;
struct timespec time;
time.tv_sec = 0;
start = stamp();
for(ns = 500; ns <= 10000; ns = ns + 100){
loops = ns * CALIBRATION / 50; /* will have rounding issues */
for(m = 0; m < 10000000; m++){ /* this is just to slow things down, so we can test */
for(j = 0; j < loops; j++){
k = j; /* make sure any compile optimizer doesn't take out the loop */
} /* endfor */
} /* endfor */
now = stamp();
printf("%d %llu\n", ns, (now - start));
start = now;
} /* endfor */
return(0);
}
预期问题:为什么延迟循环测试中没有 gettimeofday 抖动,而您(我)声称这是 nanosleep 版本中抖动的原因?答案:可能是因为延迟循环方法每个样本的运行时间比 nanosleep 版本长得多。或者,我错了。
这些测试强制使用 CPU 亲和性。例如:
time taskset -c 3 ./test_slp3
我对整个测试进行了计时,只是为了进行合理性检查,因为各个时间的总和应该等于总时间。他们做到了。
CPU 频率调节调节器设置为性能:
$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance