NLP: tr and sort - Unix commands to extract and manipulate the words of a corpus

Split the text into one word per line, sort the words, and count how many times each one occurs:

tr -sc 'A-Za-z' '\n' < ./wizard_of_oz | sort | uniq -c


Replace ./wizard_of_oz with the path to the file you want to analyze.
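To see what the first stage does on its own, here is a minimal sketch. The function name tokenize and the sample sentence are made up for illustration. The -c option takes the complement of the set A-Za-z (that is, every non-letter), each of those characters is turned into a newline, and -s squeezes every run of newlines into a single one.

```shell
# tokenize: read text on stdin and print one word per line.
# (Hypothetical helper name, not part of the original commands.)
tokenize() {
  tr -sc 'A-Za-z' '\n'
}

# Punctuation and spaces become line breaks, so each word lands on its own line.
printf 'Hello, world! Hello again.\n' | tokenize
```

Because only A-Z and a-z count as letters here, digits, apostrophes and accented characters also act as word separators.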


Convert lowercase letters to uppercase, so that "The" and "the" are counted as the same word:

tr -sc 'A-Za-z' '\n' < ./wizard_of_oz | tr 'a-z' 'A-Z' | sort | uniq -c

Sort the words by number of occurrences, most frequent first:

tr -sc 'A-Za-z' '\n' < ./wizard_of_oz | tr 'a-z' 'A-Z' | sort | uniq -c | sort -n -r
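The whole pipeline can be wrapped in a small function and tried on a throwaway file before running it on a real corpus. This is only a sketch: the name top_words, the sample sentence and the use of head to keep the first lines are assumptions added for illustration.

```shell
# top_words FILE N: print the N most frequent words in FILE, ignoring case.
# (Hypothetical helper name; the pipeline itself is the one shown above.)
top_words() {
  tr -sc 'A-Za-z' '\n' < "$1" |  # one word per line
    tr 'a-z' 'A-Z' |             # fold to uppercase
    sort |                       # group identical words together
    uniq -c |                    # prefix each word with its count
    sort -n -r |                 # highest count first
    head -n "$2"                 # keep only the first N lines
}

# Build a tiny sample file, run the function on it, then clean up.
sample=$(mktemp)
printf 'The cat saw the dog. The dog ran.\n' > "$sample"
top_words "$sample" 2
rm -f "$sample"
```

On the sample sentence this should list THE with a count of 3, followed by DOG with a count of 2. The first sort is needed because uniq -c only merges identical lines that are adjacent.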


