在 Excel 中合并/整合具有几乎相同数据的列

在 Excel 中合并/整合具有几乎相同数据的列

给出下表(其中包括世界上所有国家,但为了方便起见,此处被截断):

在此处输入图片描述

我的最终目标是绘制散点图反对价值观. 问题是 ColumnsA包含几乎相同的数据,但是:

  1. 将会有国家A不在,反之亦然。
  2. 对于一些国家A, 柱子不包含任何数据,这些国家需要被忽略。
  3. 少数国家/地区的名称不同,例如 A 列中的名称为“波多黎各”,而 B 列中的名称为“波多黎各 (美国)”

Excel 中是否有内置功能来处理此类事情,还是需要先进行一些手动操作?

答案1

您可以尝试使用 soundex 函数的这个实现来比较这两列:

Function Soundex(Surname As String) As String
' Developed by Richard J. Yanco
' This function follows the Soundex rules given at
' http://home.utah-inter.net/kinsearch/Soundex.html

    Dim Result As String, c As String * 1
    Dim Location As Integer

    Surname = UCase(Surname)
    If Surname = "" Then
        Soundex = ""
        Exit Function
    End If

'   First character must be a letter
    If Asc(Left(Surname, 1)) < 65 Or Asc(Left(Surname, 1)) > 90 Then
        Soundex = ""
        Exit Function
    Else
'       St. is converted to Saint
        If Left(Surname, 3) = "ST." Then
            Surname = "SAINT" & Mid(Surname, 4)
        End If

'       Convert to Soundex: letters to their appropriate digit,
'                     A,E,I,O,U,Y ("slash letters") to slashes
'                     H,W, and everything else to zero-length string

        Result = Left(Surname, 1)
        For Location = 2 To Len(Surname)
            Result = Result & SoundexCategory(Mid(Surname, Location, 1))
        Next Location

'       Remove double letters
        Location = 2
        Do While Location < Len(Result)
            If Mid(Result, Location, 1) = Mid(Result, Location + 1, 1) Then
                Result = Left(Result, Location) & Mid(Result, Location + 2)
            Else
                Location = Location + 1
            End If
        Loop

'       If SoundexCategory of 1st letter equals 2nd character, remove 2nd character
        If SoundexCategory(Left(Result, 1)) = Mid(Result, 2, 1) Then
            Result = Left(Result, 1) & Mid(Result, 3)
        End If

'       Remove slashes
        For Location = 2 To Len(Result)
            If Mid(Result, Location, 1) = "/" Then
                Result = Left(Result, Location - 1) & Mid(Result, Location + 1)
            End If
        Next

'       Trim or pad with zeroes as necessary
        Select Case Len(Result)
            Case 4
                Soundex = Result
            Case Is < 4
                Soundex = Result & String(4 - Len(Result), "0")
            Case Is > 4
                Soundex = Left(Result, 4)
        End Select
    End If
End Function

Private Function SoundexCategory(c) As String
'   Returns a Soundex code for a letter
    Select Case True
        Case c Like "[AEIOUY]"
            SoundexCategory = "/"
        Case c Like "[BPFV]"
            SoundexCategory = "1"
        Case c Like "[CSKGJQXZ]"
            SoundexCategory = "2"
        Case c Like "[DT]"
            SoundexCategory = "3"
        Case c = "L"
            SoundexCategory = "4"
        Case c Like "[MN]"
            SoundexCategory = "5"
        Case c = "R"
            SoundexCategory = "6"
        Case Else 'This includes H and W, spaces, punctuation, etc.
            SoundexCategory = ""
    End Select
End Function

然后只需比较 =soundex(a1)=soundex(d1),结果对于波多黎各和波多黎各 (美国) 来说都是正确的,从那里您可以根据这个比较进行过滤。

相关内容