我正在尝试替换 CSV 文件文本文件中的字符串。行分隔符字符串是{"id":
并且希望在其每次出现之前插入一个新行,因为 CSV 文件仅显示为一行和所有列。
理想情况下,我需要用逗号分隔文件,并将其转换为列,并且在任何地方{"id":"
都会出现 a 来转换为新行,即新行
每列应以 分隔"*":
,其中 * 表示任何文本,例如:"TLP":
或"id":
下面是该文件的示例,示例文本应为每个标签生成 3 行和一列
很抱歉问了一个痛苦的问题,但是我已经尝试了我能想到的 sed 和 awk 的所有组合,但没有任何效果
{"id":"5863ddde2577f521dccd9a3a","name":"Switcher: Android 加入攻击路由器俱乐部","description":"最近,在我们永不停歇地保护世界免受恶意软件侵害的过程中,我们发现了一个行为不当的 Android 木马虽然针对 Android 操作系统的恶意软件早已不再是新鲜事,但该木马非常独特,它不是攻击用户,而是攻击用户连接的 Wi-Fi 网络,或者更准确地说,攻击用户所连接的 Wi-Fi 网络。该木马(称为 Trojan.AndroidOS.Switcher)会对路由器的管理 Web 界面执行暴力密码猜测攻击,如果攻击成功,该恶意软件会更改路由器中的 DNS 服务器地址。设置,从而将受攻击 Wi-Fi 网络中的设备的所有 DNS 查询重新路由到网络犯罪分子的服务器(此类攻击也称为 DNS 劫持)。因此,让我们详细解释 Switcher 如何执行暴力攻击。 ,进入路由器并进行 DNS 劫持。","author_name":"AlienVault","modified":"2016-12-28T15:44:30.187000","created":"2016-12-28T15:44 :30.187000","tags":["android","baidu","android","mobile","dns hijack","Trojan.AndroidOS.Switcher","Kaspersky"],"references":["hxxps ://securelist.com/blog/mobile/76969/switcher-android-joins-the-attack-the-router-club/"],"修订版":1.0,"指标":[{"内容":"" ,"指标":"acdb7bfebf04affd227c93c97df536cf","描述":"","创建":"2016-12-28T15:44:31","is_active":1,"标题":"","access_reason":" ","access_type":"公共","access_groups":[],"角色":null,"过期":null,"类型":"FileHash-MD5","id":1744766,"观察结果":1 },{"内容":"","指标":"64490fbecefa3fcdacd41995887fe510","描述":"","创建":"2016-12-28T15:44:31","is_active":1,"标题" :"","access_reason":"","access_type":"public","access_groups":[],"角色":null,"expiration":null,"type":"FileHash-MD5","id ":1744767,"观察结果":1},{"内容":"","指标":"101.200.147.153","描述":"","创建":"2016-12-28T15:44:31 ","is_active":1,"标题":"","access_reason":"","access_type":"公共","access_groups":[],"角色":null,"过期":null,"类型”:"IPv4","id":1744768,"observations":1},{"content":"","indicator":"112.33.13.11","description":"","created":"2016-12 -28T15:44:31","is_active":1,"title":"","access_reason":"","access_type":"public","access_groups":[],"role":null,"过期":null,"类型":"IPv4","id":1744769,"观察结果":1},{"内容":"","指标":"120.76.249.59","描述":"" ,"创建":"2016-12-28T15:44:31","is_active":1,"标题":"","access_reason":"","access_type":"public","access_groups":[ ],“角色”:null,“过期”:null,“类型”:“IPv4”,“id”:1744770,“观察”:1}],“TLP”:“绿色”,“公共”:true, "adversary":"","targeted_countries":["中国"],"industries":[]},{"id":"585bdcd497316a2db901eaa5","name":"乌克兰野战炮兵部队的花式熊追踪","描述":"2016 年夏末,CrowdStrike Intelligence 分析师开始调查一个名为“Попр-Д30.apk”的奇怪 Android 软件包 (APK),其中包含许多具有军事性质的俄语工件。初步研究发现,该软件包具有军事性质。文件名表明与 D-30 122 毫米牵引榴弹炮有关,这是一种火炮武器,最初于 20 世纪 60 年代在苏联制造,但至今仍在使用。深入的逆向工程显示,该 APK 包含 X-Agent 的 Android 变体,命令和控制协议与观察到的 X-Agent 的 Windows 变体密切相关,并使用了名为 RC4 的加密算法,该算法具有非常相似的 50 字节基本密钥。 ","author_name":"AlienVault","修改":"2016-12-22T14:03:53.674000","创建":"2016-12-22T14:01:56.495000","标签":["apt28" ,“花式熊”,“乌克兰”,“军事”,“X-Agent”,“D-30”,“crowdstrike”],“参考文献”:[“hxxps://www.crowdstrike.com/blog/danger -close-fancy-bear-tracking-ukrainian-field-artillery-units/","hxxps://www.crowdstrike.com/wp-content/brochures/FancyBearTracksUkrainianArtillery.pdf"],"修订版":2.0,"指标":[{"内容":"","指标":"69.90.132.215","描述":"","创建":"2016-12-22T14:01:57","is_active":1, “标题”:“”,“access_reason”:“”,“access_type”:“公共”,“access_groups”:[],“角色”:“command_and_control”,“过期”:null,“类型”:“IPv4”,“id”:1683228,“观察”:1},{“内容”:“”,“指标”:“6f7523d3019fa190499f327211e01fcb” ,"描述":"","创建":"2016-12-22T14:01:57","is_active":1,"标题":"","access_reason":"","access_type":"公共","access_groups":[],"角色":null,"过期":null,"type":"FileHash-MD5","id":1683229,"observations":2}],"TLP":" green","public":true,"adversary":"Fancy Bear","targeted_countries":["乌克兰"],"industries":["defence","military"]},{"id":"585ae32297316a22f301eaa5 ","name":"假冒应用程序利用 Super Mario Run 版本","description":"今年早些时候,我们讨论了网络犯罪分子如何利用 Pokemon Go 的流行来推出他们自己的恶意应用程序。最后,我们观察到任天堂的另一款游戏《超级马里奥》也发生了同样的情况。\n\n在任何正式发布之前,网络犯罪分子就已经发布了他们自己的马里奥相关应用程序。自 2012 年以来,我们发现了 9,000 多个与马里奥相关的应用程序。在各种在线来源上使用马里奥名字的应用程序。其中大约三分之二的应用程序表现出某种恶意行为,包括未经用户同意显示广告和下载应用程序。","author_name":"AlienVault","modified":"2016-12-21T20:16:34.201000" ,"创建":"2016-12-21T20:16:34.201000","标签":["超级马里奥","android","马里奥","任天堂","谷歌游戏","恶意软件","trendmicro "],"参考文献":["hxxp://blog.trendmicro.com/trendlabs-security-intelligence/fake-apps-take-advantage-mario-run-release/"],"修订版":1.0,"指标":[{"内容":"","指标":"8373aedc9819ff5dacb0fc1864eeb96adc5210b2","描述":"","创建":"2016-12-21T20:16:35","is_active":1,"标题":"","access_reason":"","access_type":"public","access_groups":[],"角色":null,"expiration":null,"type":"FileHash-SHA1"," id":1674453,"观察":1},{"内容":"","指标":"4ba312a6eaf79da9036d4228a43f19c611345a5a","描述":"","创建":"2016-12-21T20:16:35" ,"is_active":1,"title":"","access_reason":"",“access_type”:“public”,“access_groups”:[],“角色”:null,“expiration”:null,“type”:“FileHash-SHA1”,“id”:1674454,“observations”:1}] ,"TLP":"绿色","公共":true,"对手":"","targeted_countries":[],"行业":[]}]
答案1
正如其他人评论的那样,该文件是 JSON,而不是 CSV,您确实应该为其使用专用的 JSON 解析器。
如果需要在 JSON 库不可用的地方运行,那么将数组拆分为三个 CSV 格式的行对于脚本来说是一个足够简单的问题sed
:
s/},{"id":/\
"id":/g # remove delimiter between records, insert newline
s/^\[\?{// # remove delimiter before first record
s/}\]\?$// # remove delimiter after last record
答案2
不是真正的答案:只是快速杰克例子:
(添加输入开头缺少的“[”后)
我们可以提取ID,作者姓名, 和TLP值与:
$ jq '.[] | .id + ":" + .author_name + ":" + .TLP' input
"5863ddde2577f521dccd9a3a:AlienVault:green"
"585bdcd497316a2db901eaa5:AlienVault:green"
"585ae32297316a22f301eaa5:AlienVault:green"