Remove the unwanted parts from every line of a log file

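I have a log file taken from a server's access log. It contains every detail, including timestamps, IP addresses and so on, but I only need the path, ideally one path per line. Is there any way to do this?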

The log currently looks like this:

116.75.106.00 - - [02/Nov/2017:06:26:36 +0000] "GET /get_stores_by_coordinate/12.8888028/77.61058939999998 HTTP/1.1" 200 12 "https://somesite.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
202.00.17.17 - - [02/Nov/2017:07:29:22 +0000] "GET /get_stores_by_coordinate/12.9665752/77.54940080000006 HTTP/1.1" 200 13 "https://somesite.com/" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
106.51.00.151 - - [02/Nov/2017:10:17:40 +0000] "GET /get_stores_by_coordinate/12.9184255/77.69706129999997 HTTP/1.1" 200 13 "https://somesite.com/category/all-fresh-vegetables?page=1" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
115.00.14.188 - - [02/Nov/2017:11:21:54 +0000] "GET /get_stores_by_coordinate/12.9535435/77.65611799999999 HTTP/1.1" 200 18 "https://somesite.com/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
106.193.106.135, 107.178.33.24 - - [01/Nov/2017:04:23:17 +0000] "GET /get_stores_by_coordinate/12.8149529/77.69153989999995 HTTP/1.1" 200 13 "https://somesite.com/shocking-deal" "Mozilla/5.0 (Linux; Android 4.2.1; Andi4.7G COBALT Build/JOP40D; en-us) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Mobile Safari/537.36 Puffin/6.1.4.16005AP"
122.00.174.223 - - [01/Nov/2017:05:16:15 +0000] "GET /get_stores_by_coordinate/12.963847/77.71509939999999 HTTP/1.1" 200 13 "https://somesite.com/content/aashirvaad-superior-mp-atta-0?id=342796" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
42.109.00.29 - - [01/Nov/2017:05:22:07 +0000] "GET /get_stores_by_coordinate/12.8599073/77.6129899 HTTP/1.1" 200 13 "https://somesite.com/content/indiras-kids-teens-special-ragi-huri-hittu" "Mozilla/5.0 (Linux; Android 7.0; SM-G955F Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.98 Mobile Safari/537.36"
223.00.111.92 - - [01/Nov/2017:06:35:05 +0000] "GET /get_stores_by_coordinate/12.9867109/77.47682529999997 HTTP/1.1" 200 13 "https://somesite.com/shocking-deal" "Mozilla/5.0 (Linux; Android 7.0; SM-G610F Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.98 Mobile Safari/537.36"
157.49.600.225 - - [01/Nov/2017:06:42:04 +0000] "GET /get_stores_by_coordinate/12.9250074/77.59380280000005 HTTP/1.1" 200 13 "https://www.somesite.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
49.205.007.213 - - [01/Nov/2017:10:12:02 +0000] "GET /get_stores_by_coordinate/12.9134818/77.67630370000006 HTTP/1.1" 200 13 "https://somesite.com/category/fruits-vegetables?page=2" "Mozilla/5.0 (Linux; Android 7.0; Redmi Note 4 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.98 Mobile Safari/537.36"
117.192.00.121 - - [01/Nov/2017:11:11:32 +0000] "GET /get_stores_by_coordinate/12.988344/77.69477940000002 HTTP/1.1" 200 13 "https://somesite.com/category/all-fresh-vegetables" "Mozilla/5.0 (Linux; Android 5.1.1; SM-G920V Build/LMY47X) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.73 Mobile Safari/537.36"
27.59.00.172 - - [01/Nov/2017:13:48:20 +0000] "GET /get_stores_by_coordinate/12.899623/77.48269759999994 HTTP/1.1" 200 13 "https://somesite.com/veggie-deal-sold-out" "Mozilla/5.0 (Linux; Android 4.4.4; Che1-L04 Build/Che1-L04) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.91 Mobile Safari/537.36"
157.49.8.174 - - [01/Nov/2017:14:30:55 +0000] "GET /get_stores_by_coordinate/12.9178366/77.62893409999992 HTTP/1.1" 200 12 "https://somesite.com/category/fruits-vegetables" "Mozilla/5.0 (Linux; Android 6.0; XT1706 Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.83 Mobile Safari/537.36"
49.205.218.115 - - [01/Nov/2017:16:07:30 +0000] "GET /get_stores_by_coordinate/12.9783116/77.62680449999993 HTTP/1.1" 200 18 "https://somesite.com/shocking-deal" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
223.186.180.16 - - [01/Nov/2017:17:32:48 +0000] "GET /get_stores_by_coordinate/12.916476846940544/77.54625226376072 HTTP/1.1" 200 18 "https://somesite.com/products-search?search_text=" "Mozilla/5.0 (Linux; Android 7.0; Redmi Note 4 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.98 Mobile Safari/537.36"
122.167.42.188 - - [01/Nov/2017:18:13:55 +0000] "GET /get_stores_by_coordinate/12.879735912598306/77.58562820058591 HTTP/1.1" 200 13 "https://somesite.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36 Edge/15.15063"
106.51.109.33 - - [01/Nov/2017:18:52:13 +0000] "GET /get_stores_by_coordinate/12.9886182/77.53828640000006 HTTP/1.1" 200 13 "https://somesite.com/faqs" "Mozilla/5.0 (Linux; Android 7.0; BLN-L22 Build/HONORBLN-L22) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.73 Mobile Safari/537.36"
122.167.179.49 - - [02/Nov/2017:01:35:14 +0000] "GET /get_stores_by_coordinate/12.9911799/77.65920519999997 HTTP/1.1" 200 12 "https://somesite.com/" "Mozilla/5.0 (iPad; CPU OS 11_0_3 like Mac OS X) AppleWebKit/604.1.38 (KHTML, like Gecko) Version/11.0 Mobile/15A432 Safari/604.1"
106.51.16.99 - - [02/Nov/2017:02:38:15 +0000] "GET /get_stores_by_coordinate/12.956271173261019/77.70767604007426 HTTP/1.1" 200 13 "https://somesite.com/category/fruits-vegetables" "Mozilla/5.0 (iPhone; CPU iPhone OS 11_0_3 like Mac OS X) AppleWebKit/604.1.38 (KHTML, like Gecko) Version/11.0 Mobile/15A432 Safari/604.1"
106.51.8.130 - - [31/Oct/2017:17:02:49 +0000] "GET /get_stores_by_coordinate/12.9611228/77.64724660000002 HTTP/1.1" 200 18 "https://www.somesite.com/cart" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
220.158.131.70 - - [31/Oct/2017:17:31:49 +0000] "GET /get_stores_by_coordinate/12.9590342/77.65173249999998 HTTP/1.1" 200 18 "https://somesite.com/" "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"
27.59.107.121 - - [31/Oct/2017:19:09:49 +0000] "GET /get_stores_by_coordinate/12.9296983/77.55601120000006 HTTP/1.1" 200 13 "https://somesite.com/content/shivling-tur-dal" "Mozilla/5.0 (Linux; Android 7.1.1; ONEPLUS A3003 Build/NMF26F) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.98 Mobile Safari/537.36"
106.51.29.219 - - [01/Nov/2017:02:54:38 +0000] "GET /get_stores_by_coordinate/12.968688499999999/77.6557613 HTTP/1.1" 200 12 "https://somesite.com/category/fruits-vegetables?page=2" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
106.51.107.3 - - [01/Nov/2017:02:56:44 +0000] "GET /get_stores_by_coordinate/12.9157744/77.58748289999994 HTTP/1.1" 200 13 "https://somesite.com/category/milk" "Mozilla/5.0 (Linux; Android 7.0; SAMSUNG SM-A520F Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/5.4 Chrome/51.0.2704.106 Mobile Safari/537.36"

I need it in this form:

/get_stores_by_coordinate/12.8888028/77.61058939999998
/get_stores_by_coordinate/12.8888028/77.61058939999998
/get_stores_by_coordinate/12.8888028/77.61058939999998

Answer 1

This is based on your particular log file. I'm using Sublime Text (the best text editor ever, IMHO :)).

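Steps:

  1. Search for HTTP/
  2. Click "Find All" (Mac) or press Alt+Enter (Windows)
  3. Press Left Arrow twice (this should leave each cursor just after the path)
  4. Press Shift+Alt+Left Arrow repeatedly until the selection reaches the start of "/get_"
  5. Press Command+C (Mac) or Ctrl+C (Windows) to copy
  6. Press Command+N (Mac) or Ctrl+N (Windows) to open a new window
  7. Press Command+V (Mac) or Ctrl+V (Windows) to paste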
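
If you would rather not do this by hand, the same idea (anchor on "HTTP/", then walk back to "/get_") can be sketched in Python with plain string searches. This is only a rough sketch, not something from the answer itself: the function name extract_path is made up for illustration, and the sample line is the first line of the question with the user agent shortened.

```python
from typing import Optional

def extract_path(line: str) -> Optional[str]:
    """Return the request path from one access-log line, or None if absent."""
    # Same anchor as step 1: the " HTTP/" that follows the path.
    end = line.find(" HTTP/")
    if end == -1:
        return None
    # Same idea as step 4: go back to where "/get_" starts.
    start = line.rfind("/get_", 0, end)
    if start == -1:
        return None
    return line[start:end]

# First line from the question, user agent shortened for space.
sample = (
    '116.75.106.00 - - [02/Nov/2017:06:26:36 +0000] '
    '"GET /get_stores_by_coordinate/12.8888028/77.61058939999998 HTTP/1.1" '
    '200 12 "https://somesite.com/" "Mozilla/5.0"'
)
print(extract_path(sample))
# prints /get_stores_by_coordinate/12.8888028/77.61058939999998
```

Like the manual steps, this assumes every path of interest starts with "/get_", so it is tied to this particular log.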

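Answer 2

HTTP log files tend to follow a very specific format. You could set up a log analyzer and get these paths along with many more statistics.

That said, because the format is fairly standard, a regular expression can do the job as well.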
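
As a rough illustration of the regular expression route, here is a minimal Python sketch. It assumes each line contains a quoted request of the form "METHOD path HTTP/version", which is what the sample log in the question shows. The names REQUEST_RE and extract_paths are made up for this example, and the two sample lines are taken from the question with the user agent shortened.

```python
import re

# Quoted request line of the form "METHOD path HTTP/version".
# The single capture group is the path.
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+"')

def extract_paths(lines):
    """Return the request path of every line that contains a request line."""
    found = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            found.append(match.group(1))
    return found

# Two lines from the question, user agent shortened for space.
sample_lines = [
    '116.75.106.00 - - [02/Nov/2017:06:26:36 +0000] "GET /get_stores_by_coordinate/12.8888028/77.61058939999998 HTTP/1.1" 200 12 "https://somesite.com/" "Mozilla/5.0"',
    '106.193.106.135, 107.178.33.24 - - [01/Nov/2017:04:23:17 +0000] "GET /get_stores_by_coordinate/12.8149529/77.69153989999995 HTTP/1.1" 200 13 "https://somesite.com/shocking-deal" "Mozilla/5.0"',
]

for path in extract_paths(sample_lines):
    print(path)
```

Because the function only iterates over lines, an open file object can be passed in place of the list. Lines without a request line are skipped rather than reported.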

Answer 3

As suggested in the comments on the question, I solved this using regular expressions in Sublime Text.

First I used:

.+?(?=GET /get_stores_by_coordinate/)

Then I used:

\HTTP(.*)
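
For anyone who wants to script the same two passes, here is a hedged Python sketch. It assumes each pattern was used with an empty replacement, which the answer does not state explicitly. Python's re module rejects \H as an unknown escape, so the second pattern is written here as plain HTTP with optional leading whitespace; that is an adaptation, not the exact pattern above. The function names are made up for illustration.

```python
import re

def strip_in_two_passes(line: str) -> str:
    """Apply the two substitutions in order, replacing each match with nothing."""
    # Pass 1: drop everything before "GET /get_stores_by_coordinate/"
    # (lazy match followed by a lookahead, as in the first pattern).
    line = re.sub(r".+?(?=GET /get_stores_by_coordinate/)", "", line, count=1)
    # Pass 2: drop "HTTP" and everything after it, plus the space before it.
    line = re.sub(r"\s*HTTP.*", "", line, count=1)
    return line

def strip_path_only(line: str) -> str:
    """Variant: keep "GET " out of the lookahead so that it is removed too."""
    line = re.sub(r".+?(?=/get_stores_by_coordinate/)", "", line, count=1)
    return re.sub(r"\s*HTTP.*", "", line, count=1)

# First line from the question, user agent shortened for space.
sample = (
    '116.75.106.00 - - [02/Nov/2017:06:26:36 +0000] '
    '"GET /get_stores_by_coordinate/12.8888028/77.61058939999998 HTTP/1.1" '
    '200 12 "https://somesite.com/" "Mozilla/5.0"'
)
print(strip_in_two_passes(sample))
# prints GET /get_stores_by_coordinate/12.8888028/77.61058939999998
print(strip_path_only(sample))
# prints /get_stores_by_coordinate/12.8888028/77.61058939999998
```

Note that the first pass as written keeps the leading "GET ", because the lookahead stops in front of it. The second function shows one way to end up with the bare path that the question asks for.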