Using AWK, SED, or GREP to extract data from an HTML file

Answer 1

It's not super elegant, but you can do this:

sed -ne 's/.*"test-summary".* \([0-9][0-9]* right,[^&].*exceptions\)&nbsp.*/\1/p'

For example:

$ echo '<script>document.getElementById("test-summary").innerHTML = "<strong>Test Pages:</strong> 1 right, 0 wrong, 0 ignored, 0 exceptions&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;' | sed -ne 's/.*"test-summary".* \([0-9][0-9]* right,[^&].*exceptions\)&nbsp.*/\1/p'
1 right, 0 wrong, 0 ignored, 0 exceptions
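The same substitution can also be written with extended regular expressions, which removes most of the backslashes. This is only a sketch: it assumes a sed that supports -E (GNU and BSD sed both do), and the function name extract_summary is hypothetical, not part of the original answer.

```shell
# Hypothetical helper: print the summary counts from HTML read on stdin.
# -n suppresses default output; the trailing p prints only lines where
# the substitution succeeded.
extract_summary() {
  sed -nE 's/.*"test-summary".* ([0-9]+ right,[^&]*exceptions)&nbsp.*/\1/p'
}

# Sample line taken from the example above.
line='<script>document.getElementById("test-summary").innerHTML = "<strong>Test Pages:</strong> 1 right, 0 wrong, 0 ignored, 0 exceptions&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;'

printf '%s\n' "$line" | extract_summary
# prints: 1 right, 0 wrong, 0 ignored, 0 exceptions
```

To run it against a file instead of a pipe, redirect the file into the function, for example extract_summary < file.html.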

Answer 2

With grep and awk:

grep 'document.getElementById("test-summary")' file.html | awk -F'</strong>|&' '{print $2}'
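The grep step can be folded into awk itself. The sketch below uses the same field separator, selects the line with a literal index() match, and trims the space that follows </strong>. The function name summary_awk is hypothetical.

```shell
# Hypothetical helper: same extraction in a single awk call, no grep.
summary_awk() {
  awk -F'</strong>|&' '
    # index() does a literal substring search, so the dots and
    # parentheses in the needle are not treated as regex characters.
    index($0, "document.getElementById(\"test-summary\")") {
      sub(/^[ \t]+/, "", $2)   # drop the space that follows </strong>
      print $2
    }
  '
}

# Sample line taken from the sed example earlier in the thread.
line='<script>document.getElementById("test-summary").innerHTML = "<strong>Test Pages:</strong> 1 right, 0 wrong, 0 ignored, 0 exceptions&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;'

printf '%s\n' "$line" | summary_awk
# prints: 1 right, 0 wrong, 0 ignored, 0 exceptions
```

Like the original pipeline, this relies on the summary being the text between the first </strong> and the first & on the line, so it would need adjusting if the markup changes.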
