在 excel 中我有一列(实际上是多列,约 30K 行),每个单元格的值看起来像这样
7AA914BC, 898B70FB, 898B70FB, 15DD4C5B, 15DD4C5B, 98D2185E, 898BAC48, 98D2185E, 8CFB1468, 8CFB1468, 98C35520, 98C35520, 98C35520, 98D13F8C, 98D13F8C, 98D13F8C, B04680D5, B04680D5, AB2BD8A8, AB2BD8A8, AB2BD8A8, 898C00B0
CSV 最多可以有 50 个。我想提取相邻单元格中的唯一值,其中唯一值将再次用逗号分隔。
我知道“文本分列”和“删除重复项”功能,但这两个功能都不是我想要的。我一直在寻找这个问题的答案,但无济于事。如果这个网站上有答案,我很抱歉,并希望提供该帖子的链接。
提前致谢!
答案1
要用公式来做到这一点,它就变成了一个非常复杂的 TEXTJOIN 数组公式版本。
=TEXTJOIN(", ",TRUE,IF(MATCH(TRIM(MID(SUBSTITUTE(A1,",",REPT(" ",999)),(ROW($ZZ$1:INDEX($ZZ:$ZZ,LEN(A1)-LEN(SUBSTITUTE(A1,",",""))+1))-1)*999+1,999)),TRIM(MID(SUBSTITUTE(A1,",",REPT(" ",999)),(ROW($ZZ$1:INDEX($ZZ:$ZZ,LEN(A1)-LEN(SUBSTITUTE(A1,",",""))+1))-1)*999+1,999)),0)=ROW($ZZ$1:INDEX($ZZ:$ZZ,LEN(A1)-LEN(SUBSTITUTE(A1,",",""))+1)),TRIM(MID(SUBSTITUTE(A1,",",REPT(" ",999)),(ROW($ZZ$1:INDEX($ZZ:$ZZ,LEN(A1)-LEN(SUBSTITUTE(A1,",",""))+1))-1)*999+1,999)),""))
作为数组公式,退出编辑模式时必须使用 Ctrl-Shift-Enter 而不是 Enter 进行确认。
这基本上是在逗号上拆分字符串,为每个部分创建一个数组,然后使用 MATCH 迭代这些数组以确认哪个是第一个,并返回一个值数组(如果是第一个)或""
TEXTJOIN。TEXTJOIN 会忽略""
并仅返回唯一列表。
TRIM(MID(SUBSTITUTE(A1,",",REPT(" ",999)),(ROW($ZZ$1:INDEX($ZZ:$ZZ,LEN(A1)-LEN(SUBSTITUTE(A1,",",""))+1))-1)*999+1,999))
创建值数组的部分是否在,
如果有新的动态数组公式,我们可以使用 UNIQUE() 并大大缩短公式:
=TEXTJOIN(", ",TRUE,UNIQUE(TRIM(MID(SUBSTITUTE(A1,",",REPT(" ",999)),(ROW($ZZ$1:INDEX($ZZ:$ZZ,LEN(A1)-LEN(SUBSTITUTE(A1,",",""))+1))-1)*999+1,999))))
VBA 更适合这种情况。
将其放入工作簿附带的标准模块中:
Function MyUniqueStr(str As String, delim As String)
Dim dic As Object
Set dic = CreateObject("Scripting.Dictionary")
Dim strArr() As String
strArr = Split(str, delim)
Dim strPart As Variant
For Each strPart In strArr
On Error Resume Next
dic.Add Trim(strPart), Trim(strPart)
On Error GoTo 0
Next strPart
Dim temp As String
temp = ""
Dim key As Variant
For Each key In dic
temp = temp & key & delim
Next key
MyUniqueStr = Left(temp, Len(temp) - Len(delim))
End Function
然后就可以像普通公式一样使用它:
=MyUniqueStr(A1,", ")
答案2
或者尝试这个更短的、非数组的并且不需要 CSE 公式的解决方案:
=TEXTJOIN(", ",1,INDEX(FILTERXML("<a><b>"&SUBSTITUTE(A1,", ","</b><b>")&"</b></a>","//b[not(preceding::*=.)]"),0))
TEXTJOIN 是 Office 365 中提供的一项新功能
答案3
对于仅有公式的解决方案,例如bosco_yip
,UNIQUE()
需要...
所以:
=TEXTJOIN(",",TRUE,
UNIQUE(
FILTERXML("<Group><Elements>"&
SUBSTITUTE(A1, ", ", "</Elements><Elements>")&
"</Elements></Group>","/Group/Elements")
)
)
如果没有它,则只会得到原始内容,只是分隔符是一个裸逗号,而不是逗号空格。
答案4
无需脚本。此公式假定单元格中的最后一个值不以逗号结尾,并且在原始内容中找不到“~”字符。
=mid(substitute(concatenate(unique(split(substitute(A1&",",", ",",~"),"~",FALSE),TRUE)),",",", "),1,len(substitute(concatenate(unique(split(substitute(A1&",",", ",",~"),"~",FALSE),TRUE)),",",", "))-2)