Robert Elder Software Inc.

Intro To 'comm' Command In Linux

2023-09-06 - By Robert Elder

     I use the 'comm' command to find all of the lines that are common between two files:

comm -12 A.txt B.txt
b
c
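
     The contents of 'A.txt' and 'B.txt' are not shown above, so the following sketch assumes two small, already sorted files (hypothetical contents) that would produce the same result:

```shell
# Hypothetical contents for A.txt and B.txt (assumed for illustration only).
workdir=$(mktemp -d)
printf '%s\n' a b c > "$workdir/A.txt"
printf '%s\n' b c d > "$workdir/B.txt"

# -1 hides lines found only in A.txt and -2 hides lines found only in B.txt,
# so only the lines common to both files remain.
common=$(comm -12 "$workdir/A.txt" "$workdir/B.txt")
printf '%s\n' "$common"

rm -r "$workdir"
```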

Things You Can Do With The 'comm' Command

  • Using 'comm' To Compute Only Set A
  • Using 'comm' To Compute Only Set B
  • Using 'comm' To Compute A \ B (Set Subtraction)
  • Using 'comm' To Compute B \ A (Set Subtraction)
  • Using 'comm' To Compute A ∩ B (Intersection)
  • Using 'comm' To Compute A ∪ B (Union)
  • Using 'comm' To Compute (A ∪ B) ∖ (A ∩ B) (Disjunctive Union)
  • Using 'comm' To Compute ∅ (Empty Set)
  • Understanding The 'comm' Command
  • Avoiding Tab Indenting
  • The 'comm' Command Only Works With Sorted Inputs
comm Command Set Operations

Example Use Cases

     In the next few sections, we'll review some example use cases of the 'comm' command that make use of the following two files: 'plants.txt' and 'foods.txt':

     This file 'plants.txt' contains this list of plants:

Oak Tree
Corn
Poison Ivy
Potato
Wheat
Grass

     and this file, 'foods.txt', contains a list of foods:

Wheat
Corn
Potato
Milk
Fish
Energy Drinks
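
     To follow along, the two files can be recreated in the current directory with something like the sketch below. The use of 'printf' here is just one convenient way to write one item per line:

```shell
# Recreate the two example files, one item per line, in the order listed above.
printf '%s\n' 'Oak Tree' 'Corn' 'Poison Ivy' 'Potato' 'Wheat' 'Grass' > plants.txt
printf '%s\n' 'Wheat' 'Corn' 'Potato' 'Milk' 'Fish' 'Energy Drinks' > foods.txt

# Neither file is sorted, which is why the examples pass them through 'sort'.
cat plants.txt
cat foods.txt
```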

     NOTE: In the examples below, it is assumed that your input does not contain tab characters.  See the section 'Avoiding Tab Indenting' for a special note on this topic.

Using 'comm' To Compute Only Set A

     This 'comm' command will show only the list of plants:

Only Set A
#  All Plants:  Only Set A
comm -2 <(sort plants.txt) <(sort foods.txt) | tr -d '\t'
Corn
Grass
Oak Tree
Poison Ivy
Potato
Wheat

Using 'comm' To Compute Only Set B

     and this command will show only the list of foods:

Only Set B
#  All Foods:  Only Set B
comm -1 <(sort plants.txt) <(sort foods.txt) | tr -d '\t'
Corn
Energy Drinks
Fish
Milk
Potato
Wheat

Using 'comm' To Compute A \ B (Set Subtraction)

     This will show plants that are not foods:

A Minus B
#  Plants that are not foods: (A ∖ B) Set Subtraction
comm -23 <(sort plants.txt) <(sort foods.txt) | tr -d '\t'
Grass
Oak Tree
Poison Ivy

Using 'comm' To Compute B \ A (Set Subtraction)

     This will show foods that are not plants:

B Minus A
#  Foods that are not plants: (B ∖ A) Set Subtraction
comm -13 <(sort plants.txt) <(sort foods.txt) | tr -d '\t'
Energy Drinks
Fish
Milk

Using 'comm' To Compute A ∩ B (Intersection)

     This command shows items that are both plants and foods:

Intersection
#  Plants that are also foods: (A ∩ B), Intersection
comm -12 <(sort plants.txt) <(sort foods.txt) | tr -d '\t'
Corn
Potato
Wheat

Using 'comm' To Compute A ∪ B (Union)

     and this shows items that are plants or foods:

Union
#  Items that are either plants or foods: (A ∪ B), Union
comm <(sort plants.txt) <(sort foods.txt) | tr -d '\t'
Corn
Energy Drinks
Fish
Grass
Milk
Oak Tree
Poison Ivy
Potato
Wheat

Using 'comm' To Compute (A ∪ B) ∖ (A ∩ B) (Disjunctive Union)

     This will show all plants or foods that are not both plants and foods:

Disjunctive Union
#  Items that are either plants or foods, but not both:  (A ∪ B) ∖ (A ∩ B), Disjunctive Union
comm -3 <(sort plants.txt) <(sort foods.txt) | tr -d '\t'
Energy Drinks
Fish
Grass
Milk
Oak Tree
Poison Ivy

Using 'comm' To Compute ∅ (Empty Set)

     This will always show the empty set:

Empty Set
#  This will always produce the empty Set, ∅
comm -123 <(sort plants.txt) <(sort foods.txt) | tr -d '\t'
(no output)

Understanding The 'comm' Command

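     Each number that's supplied to the 'comm' command corresponds to an output column, and therefore to an area in the Venn diagram, that will be suppressed from the final output.  '-1' suppresses lines found only in the first file, '-2' suppresses lines found only in the second file, and '-3' suppresses lines found in both files: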

Comm Command Columns

     This explains why 'comm -123' will always show no output.  Including '-123' means to suppress all three regions in the Venn diagram.
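
     The mapping can be sketched by requesting one region at a time.  This sketch assumes the same plant and food lists used earlier, uses hypothetical variable names, and writes sorted copies into a temporary directory instead of using '<(...)' so that it also runs in a plain POSIX shell:

```shell
# Use one fixed locale so that 'sort' and 'comm' agree on the ordering.
export LC_ALL=C

tmp=$(mktemp -d)
printf '%s\n' 'Oak Tree' 'Corn' 'Poison Ivy' 'Potato' 'Wheat' 'Grass' | sort > "$tmp/plants.sorted"
printf '%s\n' 'Wheat' 'Corn' 'Potato' 'Milk' 'Fish' 'Energy Drinks' | sort > "$tmp/foods.sorted"

# Suppressing two of the three columns leaves exactly one region.
only_plants=$(comm -23 "$tmp/plants.sorted" "$tmp/foods.sorted")  # column 1: only in file 1
only_foods=$(comm -13 "$tmp/plants.sorted" "$tmp/foods.sorted")   # column 2: only in file 2
in_both=$(comm -12 "$tmp/plants.sorted" "$tmp/foods.sorted")      # column 3: in both files

printf 'Column 1 (only plants):\n%s\n\n' "$only_plants"
printf 'Column 2 (only foods):\n%s\n\n' "$only_foods"
printf 'Column 3 (both):\n%s\n' "$in_both"

rm -r "$tmp"
```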

Avoiding Tab Indenting

     By default, the 'comm' command will tab indent the lines based on the column number that they belong to:

comm <(sort plants.txt) <(sort foods.txt)
		Corn
	Energy Drinks
	Fish
Grass
	Milk
Oak Tree
Poison Ivy
		Potato
		Wheat

     In a previous version of this article, I suggested that you can remove this indentation by specifying the '--output-delimiter' flag with an empty delimiter.  However, this is not entirely correct!  For example, on my machine, if I specify an empty output delimiter, I'll see the following output:

comm --output-delimiter='' <(sort plants.txt) <(sort foods.txt)
Corn
Energy Drinks
Fish
Grass
Milk
Oak Tree
Poison Ivy
Potato
Wheat

     Which looks fine upon cursory inspection.  However, if you pipe this into 'xxd':

comm --output-delimiter='' <(sort plants.txt) <(sort foods.txt) | xxd
00000000: 0000 436f 726e 0a00 456e 6572 6779 2044  ..Corn..Energy D
00000010: 7269 6e6b 730a 0046 6973 680a 4772 6173  rinks..Fish.Gras
00000020: 730a 004d 696c 6b0a 4f61 6b20 5472 6565  s..Milk.Oak Tree
00000030: 0a50 6f69 736f 6e20 4976 790a 0000 506f  .Poison Ivy...Po
00000040: 7461 746f 0a00 0057 6865 6174 0a         tato...Wheat.

     you will notice that using --output-delimiter='' doesn't give you an empty string for the delimiter; it uses a single null character (the 00 bytes in the hex dump) instead!

     I am not sure whether this should be considered a bug in my own GNU Coreutils v8.30 version of the 'comm' command.  The '--output-delimiter' flag does not appear to be in the POSIX standard for the 'comm' command, so it's not surprising that the behaviour of this flag is a bit less predictable.  I checked the source code of an older version of the GNU 'comm' command, and I believe older versions may even issue an error message if you try to specify an empty output delimiter.

     This doesn't matter much if you're simply printing results to the terminal, but it's a big deal if you're using 'comm' to perform some kind of fundamental set operation and you want to send the results to another program (for example, even back into the 'comm' command again)!  I only noticed this issue after someone prompted me to verify (A ∪ B) ∖ (A ∩ B) using the 'comm' command, or in other words, basically this:

diff <(comm -23 <(comm <(sort plants.txt) <(sort foods.txt)) <(comm -12 <(sort plants.txt) <(sort foods.txt))) <(comm -3 <(sort plants.txt) <(sort foods.txt))

     But, predictably, this doesn't produce an empty diff because the 'comm' command adds tabs to some of the lines:

1d0
< 		Corn
8,9d6
< 		Potato
< 		Wheat

     Now, if you add --output-delimiter='', it STILL doesn't compare equally:

diff <(comm --output-delimiter='' -23 <(comm --output-delimiter='' <(sort plants.txt) <(sort foods.txt)) <(comm --output-delimiter='' -12 <(sort plants.txt) <(sort foods.txt))) <(comm --output-delimiter='' -3 <(sort plants.txt) <(sort foods.txt))
comm: file 1 is not in sorted order
comm: input is not in sorted order
Binary files /dev/fd/63 and /dev/fd/62 differ

     The fact that it says 'Binary files differ' is itself a clue, since the input files are purely ASCII text.  If the --output-delimiter flag is omitted entirely, and the output is instead piped through the 'tr' command to delete any tab characters, the result now successfully compares as being identical:

diff <(comm -23 <(comm <(sort plants.txt) <(sort foods.txt) | tr -d '\t') <(comm -12 <(sort plants.txt) <(sort foods.txt) | tr -d '\t') | tr -d '\t') <(comm -3 <(sort plants.txt) <(sort foods.txt) | tr -d '\t')
(no output)

     Based on the above, I would suggest not using the '--output-delimiter' flag to remove the indentation.  Instead, remove the tab characters by piping the output through the 'tr' command, as shown in the examples in this article.  Of course, this will not work for you if your input contains tab characters, but it is the best general-purpose solution that I can think of for dealing with this unfortunate corner-case of the 'comm' command.  If your input does contain tab characters, you could try explicitly using a null delimiter and then deleting that using 'tr'.
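
     Another option for input that contains tabs, offered here as a sketch rather than something tested in this article, is to avoid multi-column output altogether.  When only a single column is requested ('-12', '-13' or '-23'), 'comm' prints that column with no leading delimiter, so nothing needs to be stripped.  The example below builds a union from three single-column calls.  The file names and their tab-separated contents are hypothetical:

```shell
# Use one fixed locale so that 'sort' and 'comm' agree on the ordering.
export LC_ALL=C

tmp=$(mktemp -d)
# Hypothetical inputs whose lines contain tab characters (name<TAB>colour).
printf 'apple\tred\ncorn\tyellow\ngrass\tgreen\n' | sort > "$tmp/a.sorted"
printf 'corn\tyellow\nmilk\twhite\n' | sort > "$tmp/b.sorted"

# Each call prints a single column, so no indentation is added and the
# tabs inside the data are left untouched.  Sorting the combined output
# restores a single ordered list: (A \ B) + (B \ A) + (A ∩ B) = A ∪ B.
union=$(
    {
        comm -23 "$tmp/a.sorted" "$tmp/b.sorted"
        comm -13 "$tmp/a.sorted" "$tmp/b.sorted"
        comm -12 "$tmp/a.sorted" "$tmp/b.sorted"
    } | sort
)
printf '%s\n' "$union"

rm -r "$tmp"
```

     The trade-off is that 'comm' runs three times instead of once, which should only matter for very large inputs.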

The 'comm' Command Only Works With Sorted Inputs

     The 'comm' command expects both input files to be sorted, and if they're not, the output may be incorrect:

comm -12 --output-delimiter='' plants.txt foods.txt
comm: file 1 is not in sorted order
Wheat
comm: file 2 is not in sorted order

     Fortunately, you can use the 'sort' command and the '<(...)' process substitution syntax in bash to pass a sorted version of each file directly into the 'comm' command like this:

comm -12 <(sort plants.txt) <(sort foods.txt)
Corn
Potato
Wheat
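
     If you are unsure whether a file is already sorted, 'sort -c' can check without producing any sorted output.  The sketch below assumes the 'plants.txt' contents from earlier, uses hypothetical variable names, and pins the locale with LC_ALL=C on the assumption that 'sort' and 'comm' should agree on the ordering:

```shell
# Use one fixed locale so that 'sort' and 'comm' agree on the ordering.
export LC_ALL=C

tmp=$(mktemp -d)
printf '%s\n' 'Oak Tree' 'Corn' 'Poison Ivy' 'Potato' 'Wheat' 'Grass' > "$tmp/plants.txt"

# 'sort -c' only checks the order: exit status 0 means sorted, non-zero means not.
if sort -c "$tmp/plants.txt" 2>/dev/null; then
    plants_state="sorted"
else
    plants_state="unsorted"
fi
echo "plants.txt is $plants_state"

# Writing a sorted copy gives 'comm' an input it can use directly.
sort "$tmp/plants.txt" > "$tmp/plants.sorted"
if sort -c "$tmp/plants.sorted" 2>/dev/null; then
    sorted_state="sorted"
else
    sorted_state="unsorted"
fi
echo "plants.sorted is $sorted_state"

rm -r "$tmp"
```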

     And that's why the 'comm' command is my favourite Linux command.
