comm
Original author(s) | Lee E. McMahon |
---|---|
Developer(s) | AT&T Bell Laboratories, Richard Stallman, David MacKenzie |
Initial release | November 1973 |
Written in | C |
Operating system | Unix, Unix-like, Plan 9, Inferno |
Platform | Cross-platform |
Type | Command |
License | coreutils: GPLv3+; Plan 9: MIT License |
The comm command in the Unix family of computer operating systems is a utility that is used to compare two files for common and distinct lines. comm is specified in the POSIX standard. It has been widely available on Unix-like operating systems since the mid to late 1980s.
Return code
Unlike diff, the return code from comm has no logical significance concerning the relationship of the two files. A return code of 0 indicates success; a return code greater than 0 indicates that an error occurred during processing.
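The contrast with diff can be sketched as follows. This is a minimal illustration, assuming a POSIX shell in which comm, diff and mktemp are available; the file names foo, bar and missing are hypothetical, and the exact non-zero value returned on error may vary between implementations.

```shell
# Work in a scratch directory (mktemp -d is assumed to be available).
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
LC_ALL=C; export LC_ALL

printf '%s\n' apple banana eggplant > foo
printf '%s\n' apple banana banana zucchini > bar

# The files differ, yet comm still reports success.
comm foo bar > /dev/null
comm_status=$?

# diff uses its exit status to say whether the files differ.
diff foo bar > /dev/null
diff_status=$?

# A processing error, such as an operand that cannot be opened, makes comm fail.
comm foo missing > /dev/null 2>&1
error_status=$?

echo "comm: $comm_status, diff: $diff_status, comm with missing file: $error_status"
```

A script that needs to know whether two files share any lines therefore has to inspect the output of comm rather than its return code.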
Example
$ cat foo
apple
banana
eggplant
$ cat bar
apple
banana
banana
zucchini
$ comm foo bar
		apple
		banana
	banana
eggplant
	zucchini
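The output has three tab-separated columns: lines found only in foo, lines found only in bar, and lines found in both. This shows that both files have apple and one banana, that only foo has eggplant, and that only bar has a second banana and zucchini.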
In more detail, the output looks as follows. The column a line belongs to is determined by the number of leading tab characters: none for the first column, one for the second, and two for the third. \t represents a tab character and \n represents a newline. The first column of the table is the output line number and the header row is the character position.
Line | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
0 | \t | \t | a | p | p | l | e | \n | | |
1 | \t | \t | b | a | n | a | n | a | \n | |
2 | \t | b | a | n | a | n | a | \n | | |
3 | e | g | g | p | l | a | n | t | \n | |
4 | \t | z | u | c | c | h | i | n | i | \n |
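The leading tabs can be made visible directly. The following is a minimal sketch, assuming a POSIX shell with comm, sed and mktemp available; it recreates the foo and bar files from the example and uses the `l` command of sed, which prints tabs as \t and marks the end of each line with $.

```shell
# Work in a scratch directory (mktemp -d is assumed to be available).
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
LC_ALL=C; export LC_ALL

printf '%s\n' apple banana eggplant > foo
printf '%s\n' apple banana banana zucchini > bar

# sed's "l" command prints each line unambiguously:
# tabs appear as \t and the end of each line is marked with $.
visible=$(comm foo bar | sed -n l)
printf '%s\n' "$visible"
```

Under these assumptions the first line is printed as \t\tapple$, matching row 0 of the table, and the eggplant line is printed with no leading \t at all.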
Comparison to diff
In general terms, diff is a more powerful utility than comm. The simpler comm is best suited for use in scripts.
The primary distinction between comm and diff is that comm expects both inputs to be sorted, so it discards any information about the order the lines had before sorting.
A minor difference between comm and diff is that comm will not try to indicate that a line has "changed" between the two files; each line is shown in exactly one of the "from file #1", "from file #2", or "in both" columns. This can be useful if one wishes two lines to be considered different even if they differ only subtly.
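Both points can be seen in a small sketch. It assumes a POSIX shell with comm, sort and mktemp available, and the file names first.txt and second.txt are hypothetical. The C locale is selected so that collation is byte-wise and "Fig" sorts before "apple".

```shell
# Work in a scratch directory (mktemp -d is assumed to be available).
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
LC_ALL=C; export LC_ALL   # byte-wise collation, so "Fig" sorts before "apple"

printf '%s\n' pear apple fig > first.txt
printf '%s\n' Fig pear apple > second.txt

# comm expects sorted input, so the original line order is given up here.
sort first.txt > first.sorted
sort second.txt > second.sorted

# "fig" and "Fig" are reported as two unrelated lines, not as one changed line.
comparison=$(comm first.sorted second.sorted)
printf '%s\n' "$comparison"
```

In the result, Fig lands in the second column and fig in the first, with nothing linking the two, whereas diff run on the unsorted files would describe the difference in terms of the original line positions.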
Other options
comm has command-line options (-1, -2 and -3) to suppress any of the three columns. This is useful for scripting.
One of the two files (but not both) can also be read from standard input, by giving - in place of its file name.
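A minimal sketch of these features, assuming a POSIX shell with comm, sort and mktemp available. It uses the -1, -2 and -3 options and the - operand as described by POSIX, and recreates the foo and bar files from the example above.

```shell
# Work in a scratch directory (mktemp -d is assumed to be available).
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
LC_ALL=C; export LC_ALL

printf '%s\n' apple banana eggplant > foo
printf '%s\n' apple banana banana zucchini > bar

# -1, -2 and -3 each suppress the column with that number.
common=$(comm -12 foo bar)        # column 3 only: lines in both files
only_foo=$(comm -23 foo bar)      # column 1 only: lines unique to foo
only_bar=$(comm -13 foo bar)      # column 2 only: lines unique to bar

# An operand of "-" stands for standard input; here it replaces the first file.
from_stdin=$(printf '%s\n' eggplant apple banana | sort | comm -12 - bar)

printf 'common:\n%s\nonly in foo:\n%s\nonly in bar:\n%s\n' \
    "$common" "$only_foo" "$only_bar"
```

When columns are suppressed, the remaining column is printed without the leading tabs it would otherwise carry, which makes the output convenient to feed into other commands.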
Limits
Up to a full line must be buffered from each input file during line comparison, before the next output line is written.
Some implementations read lines with the function readlinebuffer() which does not impose any line length limits if system memory suffices.
Other implementations read lines with the function fgets(). This function requires a fixed buffer. For these implementations, the buffer is often sized according to the POSIX macro LINE_MAX.
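The value of LINE_MAX on a given system can be queried from the shell. This is a small sketch, assuming the POSIX getconf utility is installed; it says nothing about which buffering strategy the local comm actually uses.

```shell
# getconf reports system configuration values; LINE_MAX is the POSIX limit
# on the length of a utility's input line, in bytes.
line_max=$(getconf LINE_MAX)
echo "LINE_MAX on this system: $line_max bytes"
```

POSIX requires this value to be at least 2048, so an fgets()-based implementation sized this way handles lines up to that length.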
See also
- Comparison of file comparison tools
- List of Unix commands
- cmp (Unix) – character oriented file comparison
- cut (Unix) – splitting column-oriented files
External links
- comm: select or reject lines common to two files – Shell and Utilities Reference, The Single UNIX Specification, Version 4 from The Open Group
- comm – Plan 9 Programmer's Manual, Volume 1
- comm – Inferno General commands Manual