How to count occurrences of text in a file? The Next CEO of Stack OverflowCount duplicated words in a text fileuniq --count command is yields incorrect result?How to compare two (vague) file lists and print the duplicates?How to count occurrences of each character?How do I count text lines?How long it will take to sort uniq a 62GB file?How does uniq work?Count number of data points in fileWrong sorting a text fileHow to count number of partial occurrences of a string in a file

Is it okay to majorly distort historical facts while writing a fiction story?

is it ok to reduce charging current for li ion 18650 battery?

How to invert MapIndexed on a ragged structure? How to construct a tree from rules?

How to get from Geneva Airport to Metabief?

Would this house-rule that treats advantage as a +1 to the roll instead (and disadvantage as -1) and allows them to stack be balanced?

Inappropriate reference requests from Journal reviewers

What is the difference between 翼 and 翅膀?

Is it my responsibility to learn a new technology in my own time my employer wants to implement?

Received an invoice from my ex-employer billing me for training; how to handle?

What happened in Rome, when the western empire "fell"?

What was the first Unix version to run on a microcomputer?

How did people program for Consoles with multiple CPUs?

Rotate a column

Make solar eclipses exceedingly rare, but still have new moons

Chain wire methods together in Lightning Web Components

unclear about Dynamic Binding

Is it possible to replace duplicates of a character with one character using tr

Why isn't acceleration always zero whenever velocity is zero, such as the moment a ball bounces off a wall?

Circle x^2 + y^2 = n! doesn't hit any lattice points for any n except for 0, 1, 2 and 6 or does it?

Newlines in BSD sed vs gsed

Can we say or write : "No, it'sn't"?

Why, when going from special to general relativity, do we just replace partial derivatives with covariant derivatives?

Is a distribution that is normal, but highly skewed considered Gaussian?

Why do remote US companies require working in the US?



How to count occurrences of text in a file?



The Next CEO of Stack OverflowCount duplicated words in a text fileuniq --count command is yields incorrect result?How to compare two (vague) file lists and print the duplicates?How to count occurrences of each character?How do I count text lines?How long it will take to sort uniq a 62GB file?How does uniq work?Count number of data points in fileWrong sorting a text fileHow to count number of partial occurrences of a string in a file










16

















I have a log file sorted by IP addresses,
I want to find the number of occurrences of each unique IP address.
How can I do this with bash? Possibly listing the number of occurrences next to an ip, such as:



5.135.134.16 count: 5
13.57.220.172: count 30
18.206.226 count:2


and so on.



Here’s a sample of the log:



5.135.134.16 - - [23/Mar/2019:08:42:54 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "POST /wp-login.php HTTP/1.1" 200 3836 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "POST /wp-login.php HTTP/1.1" 200 3988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:56 -0400] "POST /xmlrpc.php HTTP/1.1" 200 413 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:05 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:06 -0400] "POST /wp-login.php HTTP/1.1" 200 3985 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:07 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:08 -0400] "POST /wp-login.php HTTP/1.1" 200 3833 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:09 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:11 -0400] "POST /wp-login.php HTTP/1.1" 200 3836 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:12 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:15 -0400] "POST /wp-login.php HTTP/1.1" 200 3837 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:17 -0400] "POST /xmlrpc.php HTTP/1.1" 200 413 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.233.99 - - [23/Mar/2019:04:17:45 -0400] "GET / HTTP/1.1" 200 25160 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "https://www.google.com/url?3a622303df89920683e4421b2cf28977" "Mozilla/5.0 (Windows NT 6.2; rv:33.0) Gecko/20100101 Firefox/33.0"
18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] "POST /wp-login.php HTTP/1.1" 200 3988 "https://www.google.com/url?3a622303df89920683e4421b2cf28977" "Mozilla/5.0 (Windows NT 6.2; rv:33.0) Gecko/20100101 Firefox/33.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"









share|improve this question
























  • With “bash”, do you mean the plain shell or the command line in general?

    – dessert
    yesterday











  • Do you have any database software available to use?

    – SpacePhoenix
    yesterday











  • Related

    – Julien Lopez
    20 hours ago
















16

















I have a log file sorted by IP addresses,
I want to find the number of occurrences of each unique IP address.
How can I do this with bash? Possibly listing the number of occurrences next to an ip, such as:



5.135.134.16 count: 5
13.57.220.172: count 30
18.206.226 count:2


and so on.



Here’s a sample of the log:



5.135.134.16 - - [23/Mar/2019:08:42:54 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "POST /wp-login.php HTTP/1.1" 200 3836 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "POST /wp-login.php HTTP/1.1" 200 3988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:56 -0400] "POST /xmlrpc.php HTTP/1.1" 200 413 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:05 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:06 -0400] "POST /wp-login.php HTTP/1.1" 200 3985 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:07 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:08 -0400] "POST /wp-login.php HTTP/1.1" 200 3833 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:09 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:11 -0400] "POST /wp-login.php HTTP/1.1" 200 3836 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:12 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:15 -0400] "POST /wp-login.php HTTP/1.1" 200 3837 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:17 -0400] "POST /xmlrpc.php HTTP/1.1" 200 413 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.233.99 - - [23/Mar/2019:04:17:45 -0400] "GET / HTTP/1.1" 200 25160 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "https://www.google.com/url?3a622303df89920683e4421b2cf28977" "Mozilla/5.0 (Windows NT 6.2; rv:33.0) Gecko/20100101 Firefox/33.0"
18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] "POST /wp-login.php HTTP/1.1" 200 3988 "https://www.google.com/url?3a622303df89920683e4421b2cf28977" "Mozilla/5.0 (Windows NT 6.2; rv:33.0) Gecko/20100101 Firefox/33.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"









share|improve this question
























  • With “bash”, do you mean the plain shell or the command line in general?

    – dessert
    yesterday











  • Do you have any database software available to use?

    – SpacePhoenix
    yesterday











  • Related

    – Julien Lopez
    20 hours ago














16












16








16


5








I have a log file sorted by IP addresses,
I want to find the number of occurrences of each unique IP address.
How can I do this with bash? Possibly listing the number of occurrences next to an ip, such as:



5.135.134.16 count: 5
13.57.220.172: count 30
18.206.226 count:2


and so on.



Here’s a sample of the log:



5.135.134.16 - - [23/Mar/2019:08:42:54 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "POST /wp-login.php HTTP/1.1" 200 3836 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "POST /wp-login.php HTTP/1.1" 200 3988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:56 -0400] "POST /xmlrpc.php HTTP/1.1" 200 413 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:05 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:06 -0400] "POST /wp-login.php HTTP/1.1" 200 3985 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:07 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:08 -0400] "POST /wp-login.php HTTP/1.1" 200 3833 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:09 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:11 -0400] "POST /wp-login.php HTTP/1.1" 200 3836 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:12 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:15 -0400] "POST /wp-login.php HTTP/1.1" 200 3837 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:17 -0400] "POST /xmlrpc.php HTTP/1.1" 200 413 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.233.99 - - [23/Mar/2019:04:17:45 -0400] "GET / HTTP/1.1" 200 25160 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "https://www.google.com/url?3a622303df89920683e4421b2cf28977" "Mozilla/5.0 (Windows NT 6.2; rv:33.0) Gecko/20100101 Firefox/33.0"
18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] "POST /wp-login.php HTTP/1.1" 200 3988 "https://www.google.com/url?3a622303df89920683e4421b2cf28977" "Mozilla/5.0 (Windows NT 6.2; rv:33.0) Gecko/20100101 Firefox/33.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"









share|improve this question


















I have a log file sorted by IP addresses,
I want to find the number of occurrences of each unique IP address.
How can I do this with bash? Possibly listing the number of occurrences next to an ip, such as:



5.135.134.16 count: 5
13.57.220.172: count 30
18.206.226 count:2


and so on.



Here’s a sample of the log:



5.135.134.16 - - [23/Mar/2019:08:42:54 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "POST /wp-login.php HTTP/1.1" 200 3836 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:55 -0400] "POST /wp-login.php HTTP/1.1" 200 3988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
5.135.134.16 - - [23/Mar/2019:08:42:56 -0400] "POST /xmlrpc.php HTTP/1.1" 200 413 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:05 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:06 -0400] "POST /wp-login.php HTTP/1.1" 200 3985 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:07 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:08 -0400] "POST /wp-login.php HTTP/1.1" 200 3833 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:09 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:11 -0400] "POST /wp-login.php HTTP/1.1" 200 3836 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:12 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:15 -0400] "POST /wp-login.php HTTP/1.1" 200 3837 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.220.172 - - [23/Mar/2019:11:01:17 -0400] "POST /xmlrpc.php HTTP/1.1" 200 413 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
13.57.233.99 - - [23/Mar/2019:04:17:45 -0400] "GET / HTTP/1.1" 200 25160 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "https://www.google.com/url?3a622303df89920683e4421b2cf28977" "Mozilla/5.0 (Windows NT 6.2; rv:33.0) Gecko/20100101 Firefox/33.0"
18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] "POST /wp-login.php HTTP/1.1" 200 3988 "https://www.google.com/url?3a622303df89920683e4421b2cf28977" "Mozilla/5.0 (Windows NT 6.2; rv:33.0) Gecko/20100101 Firefox/33.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] "GET /wp-login.php HTTP/1.1" 200 2988 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"






command-line bash sort uniq






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited yesterday









dessert

25.3k673107




25.3k673107










asked yesterday









j0hj0h

6,5521657120




6,5521657120












  • With “bash”, do you mean the plain shell or the command line in general?

    – dessert
    yesterday











  • Do you have any database software available to use?

    – SpacePhoenix
    yesterday











  • Related

    – Julien Lopez
    20 hours ago


















  • With “bash”, do you mean the plain shell or the command line in general?

    – dessert
    yesterday











  • Do you have any database software available to use?

    – SpacePhoenix
    yesterday











  • Related

    – Julien Lopez
    20 hours ago

















With “bash”, do you mean the plain shell or the command line in general?

– dessert
yesterday





With “bash”, do you mean the plain shell or the command line in general?

– dessert
yesterday













Do you have any database software available to use?

– SpacePhoenix
yesterday





Do you have any database software available to use?

– SpacePhoenix
yesterday













Related

– Julien Lopez
20 hours ago






Related

– Julien Lopez
20 hours ago











7 Answers
7






active

oldest

votes


















13














You can use grep and uniq for the list of addresses, loop over them and grep again for the count:



for i in $(<log grep -o '^[^ ]*' | uniq); do
printf '%s count %dn' "$i" $(<log grep -c "$i")
done


grep -o '^[^ ]*' outputs every character from the beginning (^) until the first space of each line, uniq removes repeated lines, thus leaving you with a list of IP addresses. Thanks to command substitution, the for loop loops over this list printing the currently processed IP followed by “ count ” and the count. The latter is computed by grep -c, which counts the number of lines with at least one match.



Example run



$ for i in $(<log grep -o '^[^ ]*'|uniq);do printf '%s count %dn' "$i" $(<log grep -c "$i");done
5.135.134.16 count 5
13.57.220.172 count 9
13.57.233.99 count 1
18.206.226.75 count 2
18.213.10.181 count 3





share|improve this answer




















  • 10





    This solution iterates over the input file repeatedly, once for each IP address, which will be very slow if the file is large. The other solutions using uniq -c or awk only need to read the file once,

    – David
    yesterday






  • 1





    @David this is true, but this would have been my first go at it as well, knowing that grep counts. Unless performance is measurably a problem... dont prematurely optimize?

    – D. Ben Knoble
    yesterday






  • 3





    I would not call it a premature optimization, given that the more efficient solution is also simpler, but to each their own.

    – David
    yesterday


















33














You can use cut and uniq tools:



cut -d ' ' -f1 test.txt | uniq -c
5 5.135.134.16
9 13.57.220.172
1 13.57.233.99
2 18.206.226.75
3 18.213.10.181


Explanation :




  • cut -d ' ' -f1 : extract first field (ip address)


  • uniq -c : report repeated lines and display the number of occurences





share|improve this answer










New contributor




Mikael Flora is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.















  • 6





    One could use sed, e.g. sed -E 's/ *(S*) *(S*)/2 count: 1/' to get the output exactly like OP wanted.

    – dessert
    yesterday






  • 1





    This should be the accepted answer, as the one by dessert needs to read the file repeatedly so is much slower. And you can easily use sort file | cut .... in case you're not sure if the file is already sorted.

    – Guntram Blohm
    yesterday


















13














If you don't specifically require the given output format, then I would recommend the already posted cut + uniq based answer



If you really need the given output format, a single-pass way to do it in Awk would be



awk 'c[$1]++ ENDfor(i in c) print i, "count: " c[i]' log


This is somewhat non-ideal when the input is already sorted since it unnecessarily stores all the IPs into memory - a better, though more complicated, way to do it in the pre-sorted case (more directly equivalent to uniq -c) would be:



awk '
NR==1 last=$1
$1 != last print last, "count: " c[last]; last = $1
c[$1]++
END print last, "count: " c[last]
'


Ex.



$ awk 'NR==1 last=$1 $1 != last print last, "count: " c[last]; last = $1 c[$1]++ ENDprint last, "count: " c[last]' log
5.135.134.16 count: 5
13.57.220.172 count: 9
13.57.233.99 count: 1
18.206.226.75 count: 2
18.213.10.181 count: 3





share|improve this answer

























  • it would be easy to change the cut + uniq based answer with sed to appear in the demanded format.

    – Peter A. Schneider
    yesterday











  • @PeterA.Schneider yes it would - I believe that was already pointed out in comments to that answer

    – steeldriver
    yesterday












  • Ah, yes, I see.

    – Peter A. Schneider
    yesterday


















8














Here is one possible solution:





IN_FILE="file.log"
for IP in $(awk 'print $1' "$IN_FILE" | sort -u)
do
echo -en "$IPtcount: "
grep -c "$IP" "$IN_FILE"
done


  • replace file.log with the actual file name.

  • the command substitution expression $(awk 'print $1' "$IN_FILE" | sort -u) will provide a list of the unique values of the first column.

  • then grep -c will count each of these values within the file.


$ IN_FILE="file.log"; for IP in $(awk 'print $1' "$IN_FILE" | sort -u); do echo -en "$IPtcount: "; grep -c "$IP" "$IN_FILE"; done
13.57.220.172 count: 9
13.57.233.99 count: 1
18.206.226.75 count: 2
18.213.10.181 count: 3
5.135.134.16 count: 5





share|improve this answer




















  • 1





    Prefer printf...

    – D. Ben Knoble
    yesterday











  • This means you need to process the entire file multiple times. Once to get the list of IPs and then once more for each of the IPs you find.

    – terdon
    yesterday


















4














Some Perl:



$ perl -lae '$k$F[0]++; } print "$_ count: $k$_" for keys(%k)' log 
13.57.233.99 count: 1
18.206.226.75 count: 2
13.57.220.172 count: 9
5.135.134.16 count: 5
18.213.10.181 count: 3


This is the same idea as Steeldriver's awk approach, but in Perl. The -a causes perl to automatically split each input line into the array @F, whose first element (the IP) is $F[0]. So, $k$F[0]++ will create the hash %k, whose keys are the IPs and whose values are the number of times each IP was seen. The improve this answer




























    4














    Some Perl:



    $ perl -lae '$k$F[0]++; print "$_ count: $k$_" for keys(%k)' log 
    13.57.233.99 count: 1
    18.206.226.75 count: 2
    13.57.220.172 count: 9
    5.135.134.16 count: 5
    18.213.10.181 count: 3


    This is the same idea as Steeldriver's awk approach, but in Perl. The -a causes perl to automatically split each input line into the array @F, whose first element (the IP) is $F[0]. So, $k$F[0]++ will create the hash %k, whose keys are the IPs and whose values are the number of times each IP was seen. The improve this answer































      Some Perl:



      $ perl -lae '$k$F[0]++; print "$_ count: $k$_" for keys(%k)' log 
      13.57.233.99 count: 1
      18.206.226.75 count: 2
      13.57.220.172 count: 9
      5.135.134.16 count: 5
      18.213.10.181 count: 3


      This is the same idea as Steeldriver's awk approach, but in Perl. The -a causes perl to automatically split each input line into the array @F, whose first element (the IP) is $F[0]. So, $k$F[0]++ will create the hash %k, whose keys are the IPs and whose values are the number of times each IP was seen. The { is funky perlspeak for "do the rest at the very end, after processing all input". So, at the end, the script will iterate over the keys of the hash and print the current key ($_) along with its value ($k$_).



      And, just so people don't think that perl forces you to write script that look like cryptic scribblings, this is the same thing in a less condensed form:



      perl -e '
      while (my $line=<STDIN>)
      @fields = split(/ /, $line);
      $ip = $fields[0];
      $counts$ip++;

      foreach $ip (keys(%counts))
      print "$ip count: $counts$ipn"
      ' < log






      share|improve this answer












      share|improve this answer



      share|improve this answer










      answered yesterday









      terdonterdon

      67.4k13139222




      67.4k13139222





















          3














          Maybe this is not what the OP want; however, if we know that the IP address length will be limited to 15 characters, a quicker way to display the counts with unique IPs from a huge log file can be achieved using uniq command alone:



          $ uniq -w 15 -c log

          5 5.135.134.16 - - [23/Mar/2019:08:42:54 -0400] ...
          9 13.57.220.172 - - [23/Mar/2019:11:01:05 -0400] ...
          1 13.57.233.99 - - [23/Mar/2019:04:17:45 -0400] ...
          2 18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] ...
          3 18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] ...


          Options:



          -w N compares no more than N characters in lines



          -c will prefix lines by the number of occurrences






          share|improve this answer








          New contributor




          Y. Pradhan is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.




















          • Likely good enough in practice, but worth noting the corner cases. Only 6 probably constant characters after the IP ` - - [`. But in theory the address could be up to 8 characters shorter than the maximum so a change of date could split the count for such an IP. And as you hint, this won't work for IPv6.

            – Martin Thornton
            21 hours ago
















          3














          Maybe this is not what the OP want; however, if we know that the IP address length will be limited to 15 characters, a quicker way to display the counts with unique IPs from a huge log file can be achieved using uniq command alone:



          $ uniq -w 15 -c log

          5 5.135.134.16 - - [23/Mar/2019:08:42:54 -0400] ...
          9 13.57.220.172 - - [23/Mar/2019:11:01:05 -0400] ...
          1 13.57.233.99 - - [23/Mar/2019:04:17:45 -0400] ...
          2 18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] ...
          3 18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] ...


          Options:



          -w N compares no more than N characters in lines



          -c will prefix lines by the number of occurrences






          share|improve this answer








          New contributor




          Y. Pradhan is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.




















          • Likely good enough in practice, but worth noting the corner cases. Only 6 probably constant characters after the IP ` - - [`. But in theory the address could be up to 8 characters shorter than the maximum so a change of date could split the count for such an IP. And as you hint, this won't work for IPv6.

            – Martin Thornton
            21 hours ago














          3












          3








          3







          Maybe this is not what the OP want; however, if we know that the IP address length will be limited to 15 characters, a quicker way to display the counts with unique IPs from a huge log file can be achieved using uniq command alone:



          $ uniq -w 15 -c log

          5 5.135.134.16 - - [23/Mar/2019:08:42:54 -0400] ...
          9 13.57.220.172 - - [23/Mar/2019:11:01:05 -0400] ...
          1 13.57.233.99 - - [23/Mar/2019:04:17:45 -0400] ...
          2 18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] ...
          3 18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] ...


          Options:



          -w N compares no more than N characters in lines



          -c will prefix lines by the number of occurrences






          share|improve this answer








          New contributor




          Y. Pradhan is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.










          Maybe this is not what the OP want; however, if we know that the IP address length will be limited to 15 characters, a quicker way to display the counts with unique IPs from a huge log file can be achieved using uniq command alone:



          $ uniq -w 15 -c log

          5 5.135.134.16 - - [23/Mar/2019:08:42:54 -0400] ...
          9 13.57.220.172 - - [23/Mar/2019:11:01:05 -0400] ...
          1 13.57.233.99 - - [23/Mar/2019:04:17:45 -0400] ...
          2 18.206.226.75 - - [23/Mar/2019:21:58:07 -0400] ...
          3 18.213.10.181 - - [23/Mar/2019:14:45:42 -0400] ...


          Options:



          -w N compares no more than N characters in lines



          -c will prefix lines by the number of occurrences







          share|improve this answer








          New contributor




          Y. Pradhan is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.









          share|improve this answer



          share|improve this answer






          New contributor




          Y. Pradhan is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.









          answered yesterday









          Y. PradhanY. Pradhan

          311




          311




          New contributor




          Y. Pradhan is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.





          New contributor





          Y. Pradhan is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.






          Y. Pradhan is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.












          • Likely good enough in practice, but worth noting the corner cases. Only 6 probably constant characters after the IP ` - - [`. But in theory the address could be up to 8 characters shorter than the maximum so a change of date could split the count for such an IP. And as you hint, this won't work for IPv6.

            – Martin Thornton
            21 hours ago


















          • Likely good enough in practice, but worth noting the corner cases. Only 6 probably constant characters after the IP ` - - [`. But in theory the address could be up to 8 characters shorter than the maximum so a change of date could split the count for such an IP. And as you hint, this won't work for IPv6.

            – Martin Thornton
            21 hours ago

















          Likely good enough in practice, but worth noting the corner cases. Only 6 probably constant characters after the IP ` - - [`. But in theory the address could be up to 8 characters shorter than the maximum so a change of date could split the count for such an IP. And as you hint, this won't work for IPv6.

          – Martin Thornton
          21 hours ago






          Likely good enough in practice, but worth noting the corner cases. Only 6 probably constant characters after the IP ` - - [`. But in theory the address could be up to 8 characters shorter than the maximum so a change of date could split the count for such an IP. And as you hint, this won't work for IPv6.

          – Martin Thornton
          21 hours ago












          0














          Cut -f1 -d- my.log | sort | uniq -c


          Explanation: Take the first field of my.log splitting on ‘-‘ and sort it. Uniq needs sorted input. ‘-c’ tells it to count occurrences.






          share|improve this answer








          New contributor




          PhD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.
























            0














            Cut -f1 -d- my.log | sort | uniq -c


            Explanation: Take the first field of my.log splitting on ‘-‘ and sort it. Uniq needs sorted input. ‘-c’ tells it to count occurrences.






            share|improve this answer








            New contributor




            PhD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
            Check out our Code of Conduct.






















              0












              0








              0







              Cut -f1 -d- my.log | sort | uniq -c


              Explanation: Take the first field of my.log splitting on ‘-‘ and sort it. Uniq needs sorted input. ‘-c’ tells it to count occurrences.






              share|improve this answer








              New contributor




              PhD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.










              Cut -f1 -d- my.log | sort | uniq -c


              Explanation: Take the first field of my.log splitting on ‘-‘ and sort it. Uniq needs sorted input. ‘-c’ tells it to count occurrences.







              share|improve this answer








              New contributor




              PhD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.









              share|improve this answer



              share|improve this answer






              New contributor




              PhD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.









              answered 2 hours ago









              PhDPhD

              101




              101




              New contributor




              PhD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.





              New contributor





              PhD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.






              PhD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.



























                  draft saved

                  draft discarded
















































                  Thanks for contributing an answer to Ask Ubuntu!


                  • Please be sure to answer the question. Provide details and share your research!

                  But avoid


                  • Asking for help, clarification, or responding to other answers.

                  • Making statements based on opinion; back them up with references or personal experience.

                  To learn more, see our tips on writing great answers.




                  draft saved


                  draft discarded














                  StackExchange.ready(
                  function ()
                  StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f1129521%2fhow-to-count-occurrences-of-text-in-a-file%23new-answer', 'question_page');

                  );

                  Post as a guest















                  Required, but never shown





















































                  Required, but never shown














                  Required, but never shown












                  Required, but never shown







                  Required, but never shown

































                  Required, but never shown














                  Required, but never shown












                  Required, but never shown







                  Required, but never shown







                  Popular posts from this blog

                  Romeo and Juliet ContentsCharactersSynopsisSourcesDate and textThemes and motifsCriticism and interpretationLegacyScene by sceneSee alsoNotes and referencesSourcesExternal linksNavigation menu"Consumer Price Index (estimate) 1800–"10.2307/28710160037-3222287101610.1093/res/II.5.31910.2307/45967845967810.2307/2869925286992510.1525/jams.1982.35.3.03a00050"Dada Masilo: South African dancer who breaks the rules"10.1093/res/os-XV.57.1610.2307/28680942868094"Sweet Sorrow: Mann-Korman's Romeo and Juliet Closes Sept. 5 at MN's Ordway"the original10.2307/45957745957710.1017/CCOL0521570476.009"Ram Leela box office collections hit massive Rs 100 crore, pulverises prediction"Archived"Broadway Revival of Romeo and Juliet, Starring Orlando Bloom and Condola Rashad, Will Close Dec. 8"Archived10.1075/jhp.7.1.04hon"Wherefore art thou, Romeo? To make us laugh at Navy Pier"the original10.1093/gmo/9781561592630.article.O006772"Ram-leela Review Roundup: Critics Hail Film as Best Adaptation of Romeo and Juliet"Archived10.2307/31946310047-77293194631"Romeo and Juliet get Twitter treatment""Juliet's Nurse by Lois Leveen""Romeo and Juliet: Orlando Bloom's Broadway Debut Released in Theaters for Valentine's Day"Archived"Romeo and Juliet Has No Balcony"10.1093/gmo/9781561592630.article.O00778110.2307/2867423286742310.1076/enst.82.2.115.959510.1080/00138380601042675"A plague o' both your houses: error in GCSE exam paper forces apology""Juliet of the Five O'Clock Shadow, and Other Wonders"10.2307/33912430027-4321339124310.2307/28487440038-7134284874410.2307/29123140149-661129123144728341M"Weekender Guide: Shakespeare on The Drive""balcony"UK public library membership"romeo"UK public library membership10.1017/CCOL9780521844291"Post-Zionist Critique on Israel and the Palestinians Part III: Popular Culture"10.2307/25379071533-86140377-919X2537907"Capulets and Montagues: UK exam board admit mixing names up in Romeo and Juliet paper"Istoria Novellamente Ritrovata di Due Nobili Amanti2027/mdp.390150822329610820-750X"GCSE exam error: Board accidentally rewrites Shakespeare"10.2307/29176390149-66112917639"Exam board apologises after error in English GCSE paper which confused characters in Shakespeare's Romeo and Juliet""From Mariotto and Ganozza to Romeo and Guilietta: Metamorphoses of a Renaissance Tale"10.2307/37323537323510.2307/2867455286745510.2307/28678912867891"10 Questions for Taylor Swift"10.2307/28680922868092"Haymarket Theatre""The Zeffirelli Way: Revealing Talk by Florentine Director""Michael Smuin: 1938-2007 / Prolific dance director had showy career"The Life and Art of Edwin BoothRomeo and JulietRomeo and JulietRomeo and JulietRomeo and JulietEasy Read Romeo and JulietRomeo and Julieteeecb12003684p(data)4099369-3n8211610759dbe00d-a9e2-41a3-b2c1-977dd692899302814385X313670221313670221

                  Creating closest line along the point''s azimuth using PostgreSQL Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern) Announcing the arrival of Valued Associate #679: Cesar Manara Unicorn Meta Zoo #1: Why another podcast?Drawing line between points at specific distance in PostGIS?How to efficiently find the closest point over the dateline?How to find the nearest point by using PostGIS function?PostGIS nearest point with LATERAL JOIN in PostgreSQL 9.3+Creating a table and inserting selected streets using plpgsql functionsCreating a table that stores Distances and other columnSaving select query results (year wise) from PostgreSQL/PostGIS to text filesWhat is the information behind this geometry?How to give start and end vertex ids dynamically in pgr_dijkstra?Point to Polygon nearest distance DS_distance is not using geography index & knn <-> or <#> does not give result in orderLine to point conversion with start point and end point detection?

                  Crop image to path created in TikZ? Announcing the arrival of Valued Associate #679: Cesar Manara Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern)Crop an inserted image?TikZ pictures does not appear in posterImage behind and beyond crop marks?Tikz picture as large as possible on A4 PageTransparency vs image compression dilemmaHow to crop background from image automatically?Image does not cropTikzexternal capturing crop marks when externalizing pgfplots?How to include image path that contains a dollar signCrop image with left size given