If you want to look for non-ASCII characters, you may need to invert the search so that it excludes ASCII characters:
grep -Pn '[^\x00-\x7F]'
For example:
$ curl https://help.ubuntu.com/16.04/installation-guide/amd64/install.en.txt -s | grep -nP '[^\x00-\x7F]' | head
9:Appendix F, GNU General Public License.
14:(codename "‘Xenial Xerus’"), for the 64-bit PC ("amd64") architecture. It also
18:━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
330:when things go wrong. The Installation Howto can be found in Appendix A,
337:Chapter 1. Welcome to Ubuntu
359:1.1. What is Ubuntu?
368: • Ubuntu will always be free of charge, and there is no extra fee for the "
372: • Ubuntu includes the very best in translations and accessibility
376: • Ubuntu is shipped in stable and regular release cycles; a new release will
380: • Ubuntu is entirely committed to the principles of open source software
Lines 9, 330, 337 and 359 contain Unicode non-breaking space characters.
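A non-breaking space looks just like an ordinary space in the output above, so it can help to search for it directly. The following is a minimal sketch, assuming GNU grep and a small hypothetical sample file created with printf (it is not the installation guide itself). It looks for the two bytes 0xC2 0xA0, which are the UTF-8 encoding of U+00A0:

```shell
# Hypothetical sample file: line 2 contains a non-breaking space (U+00A0),
# which UTF-8 encodes as the two bytes 0xC2 0xA0; line 3 contains curly quotes.
sample=$(mktemp)
printf 'plain ascii line\nAppendix\302\240F\ncurly \342\200\230quote\342\200\231\n' > "$sample"

# LC_ALL=C makes grep compare raw bytes, -F treats the pattern as a fixed
# string, and -a keeps grep from classifying the input as binary.
nbsp_lines=$(LC_ALL=C grep -anF "$(printf '\302\240')" "$sample" | cut -d: -f1)
echo "$nbsp_lines"

rm -f "$sample"
```

Only line 2 of the sample is reported; the curly quotes on line 3 are non-ASCII as well, but they are not non-breaking spaces.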
The particular output you get may be due to grep's UTF-8 support. In a Unicode locale the pattern appears to be matched against decoded characters rather than raw bytes, so a range such as \x80-\xFF covers only the code points U+0080 to U+00FF and misses characters such as curly quotes or box-drawing lines. Forcing the C locale shows the expected results in that case:
$ LANG=C grep -Pn '[\x80-\xFF]' install.en.txt | head
9:Appendix F, GNU General Public License.
14:(codename "‘Xenial Xerus’"), for the 64-bit PC ("amd64") architecture. It also
18:━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
330:when things go wrong. The Installation Howto can be found in Appendix A,
337:Chapter 1. Welcome to Ubuntu
359:1.1. What is Ubuntu?
368: • Ubuntu will always be free of charge, and there is no extra fee for the "
372: • Ubuntu includes the very best in translations and accessibility
376: • Ubuntu is shipped in stable and regular release cycles; a new release will
380: • Ubuntu is entirely committed to the principles of open source software
$ LANG=en_GB.UTF-8 grep -Pn '[\x80-\xFF]' install.en.txt | head
9:Appendix F, GNU General Public License.
330:when things go wrong. The Installation Howto can be found in Appendix A,
337:Chapter 1. Welcome to Ubuntu
359:1.1. What is Ubuntu?
394:1.1.1. Sponsorship by Canonical
402:1.2. What is Debian?
456:1.2.1. Ubuntu and Debian
461:1.2.1.1. Package selection
475:1.2.1.2. Releases
501:1.2.1.3. Development community
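The difference between the two runs can be reproduced on a much smaller input. This is a sketch, assuming GNU grep built with -P support and a hypothetical three-line sample file. It uses LC_ALL rather than LANG, because LC_ALL overrides LANG when both are set in the environment:

```shell
# Hypothetical sample: line 2 has a non-breaking space (U+00A0),
# line 3 has curly quotes (U+2018 and U+2019).
sample=$(mktemp)
printf 'plain ascii line\nAppendix\302\240F\ncurly \342\200\230quote\342\200\231\n' > "$sample"

# C locale: the class is applied to raw bytes, so every line that contains
# a byte >= 0x80 is reported (lines 2 and 3 here).
c_lines=$(LC_ALL=C grep -anP '[\x80-\xFF]' "$sample" | cut -d: -f1 | paste -sd' ' -)
echo "C locale:     $c_lines"

# UTF-8 locale (assumes en_GB.UTF-8 is installed; otherwise grep falls back
# to the C locale): the class is applied to decoded characters, so the
# result can differ from the byte-wise run above.
utf8_lines=$(LC_ALL=en_GB.UTF-8 grep -nP '[\x80-\xFF]' "$sample" | cut -d: -f1 | paste -sd' ' -)
echo "UTF-8 locale: $utf8_lines"

rm -f "$sample"
```

In the C locale both non-ASCII lines are listed. Where the UTF-8 locale is available, the curly-quote line should drop out, which mirrors how lines 14 and 18 disappear from the second run above.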