Remove lines whose first two fields are the same

Dear All,

I want to remove one of the lines if the first two fields are the same. For example, I have the following file, myfile.txt:
THERMO
300.0 1500.0 5000.0
Ar                      Ar  1               G000300.00 005000.00 01000.00      1
000000002.5000 000000000.0000 000000000.0000 000000000.0000 000000000.0000     2
-00000745.3750 000000004.3660 000000002.5000 000000000.0000 000000000.0000     3
000000000.0000 000000000.0000 -00000745.3750 000000004.3660                    4
N2                      N   2               G000300.00 005000.00 01000.00      1
000000003.2987 000000000.0014 -00000000.0000 000000000.0000 -00000000.0000     2
-00001020.8999 000000003.9504 000000002.9266 000000000.0015 -00000000.0000     3
H                       H   1               G000200.00 003500.00 01000.00      1
000025473.6599 -00000000.4467 000000002.5000 -00000000.0000 000000000.0000     3
-00000000.0000 000000000.0000 000025473.6599 -00000000.4467                    4
H2                      H   2               G000200.00 003500.00 01000.00      1
000000002.3443 000000000.0080 -00000000.0000 000000000.0000 -00000000.0000     2
-00000917.9352 000000000.6830 000000003.3373 -00000000.0000 000000000.0000     3
O                       O   1               G000200.00 003500.00 01000.00      1
000000003.1683 -00000000.0033 000000000.0000 -00000000.0000 000000000.0000     2
000029122.2592 000000002.0519 000000002.5694 -00000000.0001 000000000.0000     3
O2                      O   2               G000200.00 003500.00 01000.00      1
000000003.7825 -00000000.0030 000000000.0000 -00000000.0000 000000000.0000     2
-00001063.9436 000000003.6577 000000003.2825 000000000.0015 -00000000.0000     3
OH                      H   1               G000200.00 003500.00 01000.00      1
000000003.9920 -00000000.0024 000000000.0000 -00000000.0000 000000000.0000     2
000003615.0806 -00000000.1039 000000003.0929 000000000.0005 000000000.0000     3
H2O                     H   2               G000200.00 003500.00 01000.00      1
000000004.1986 -00000000.0020 000000000.0000 -00000000.0000 000000000.0000     2
-00030293.7267 -00000000.8490 000000003.0340 000000000.0022 -00000000.0000     3
HO2                     H   1               G000200.00 003500.00 01000.00      1
000000004.3018 -00000000.0047 000000000.0000 -00000000.0000 000000000.0000     2
000000294.8080 000000003.7167 000000004.0172 000000000.0022 -00000000.0000     3
H2O2                    H   2               G000200.00 003500.00 01000.00      1
000000004.2761 -00000000.0005 000000000.0000 -00000000.0000 000000000.0000     2
-00017702.5821 000000003.4351 000000004.1650 000000000.0049 -00000000.0000     3
END

I used the code:
awk '!_[$1]++' myfile.txt

and got:
THERMO
300.0 1500.0 5000.0
Ar                      Ar  1               G000300.00 005000.00 01000.00      1
000000002.5000 000000000.0000 000000000.0000 000000000.0000 000000000.0000     2
-00000745.3750 000000004.3660 000000002.5000 000000000.0000 000000000.0000     3
000000000.0000 000000000.0000 -00000745.3750 000000004.3660                    4
N2                      N   2               G000300.00 005000.00 01000.00      1
000000003.2987 000000000.0014 -00000000.0000 000000000.0000 -00000000.0000     2
-00001020.8999 000000003.9504 000000002.9266 000000000.0015 -00000000.0000     3
H                       H   1               G000200.00 003500.00 01000.00      1
000025473.6599 -00000000.4467 000000002.5000 -00000000.0000 000000000.0000     3
-00000000.0000 000000000.0000 000025473.6599 -00000000.4467                    4
H2                      H   2               G000200.00 003500.00 01000.00      1
000000002.3443 000000000.0080 -00000000.0000 000000000.0000 -00000000.0000     2
-00000917.9352 000000000.6830 000000003.3373 -00000000.0000 000000000.0000     3
O                       O   1               G000200.00 003500.00 01000.00      1
000000003.1683 -00000000.0033 000000000.0000 -00000000.0000 000000000.0000     2
000029122.2592 000000002.0519 000000002.5694 -00000000.0001 000000000.0000     3
O2                      O   2               G000200.00 003500.00 01000.00      1
000000003.7825 -00000000.0030 000000000.0000 -00000000.0000 000000000.0000     2
-00001063.9436 000000003.6577 000000003.2825 000000000.0015 -00000000.0000     3
OH                      H   1               G000200.00 003500.00 01000.00      1
000000003.9920 -00000000.0024 000000000.0000 -00000000.0000 000000000.0000     2
000003615.0806 -00000000.1039 000000003.0929 000000000.0005 000000000.0000     3
H2O                     H   2               G000200.00 003500.00 01000.00      1
000000004.1986 -00000000.0020 000000000.0000 -00000000.0000 000000000.0000     2
-00030293.7267 -00000000.8490 000000003.0340 000000000.0022 -00000000.0000     3
HO2                     H   1               G000200.00 003500.00 01000.00      1
000000004.3018 -00000000.0047 000000000.0000 -00000000.0000 000000000.0000     2
000000294.8080 000000003.7167 000000004.0172 000000000.0022 -00000000.0000     3
H2O2                    H   2               G000200.00 003500.00 01000.00      1
000000004.2761 -00000000.0005 000000000.0000 -00000000.0000 000000000.0000     2
-00017702.5821 000000003.4351 000000004.1650 000000000.0049 -00000000.0000     3
END




It is not giving me what I want. It only considers the first field; I want it to delete a line only if the first 2 fields are the same.
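
For reference, here is a minimal sketch of what I think I am after, assuming "the same" means that a line's first and second fields together repeat those of an earlier line. The sample lines and the names `seen` and `deduped` are made up for illustration and are not taken from myfile.txt.

```shell
# Hypothetical sample input: lines 1 and 3 share both leading fields,
# while lines 1 and 2 share only the first field.
deduped=$(printf '%s\n' \
    'H2 H 2 first' \
    'H2 O 1 second' \
    'H2 H 9 third' \
    'O2 O 2 fourth' |
  # seen[$1,$2] builds one array key from both fields (joined by SUBSEP).
  # The post-increment yields 0 the first time a pair appears, so the
  # negated test is true and awk prints the line; repeats are skipped.
  awk '!seen[$1,$2]++')

printf '%s\n' "$deduped"
```

On the real file this would presumably be `awk '!seen[$1,$2]++' myfile.txt`. Keying on `$1,$2` rather than `$1` means a line is dropped only when both fields match an earlier line; the first occurrence is always kept.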

Kind regards