How do I fetch a string from a CSV, create a new CSV named after that string, and append the matching line to it?

3

This is a sample of my CSV file:

04/Feb/2016:06:38:44-0500,ab,3,10,57,200,10254
04/Feb/2016:06:39:07-0500,cd,1,42,168,304,0
04/Feb/2016:06:39:07-0500,ef,1,43,169,304,0
04/Feb/2016:06:39:07-0500,ab,1,43,170,304,0
04/Feb/2016:06:39:07-0500,cd,1,44,171,304,0
04/Feb/2016:06:39:07-0500,ef,1,45,172,304,0

I would like to fetch the string in the second column, create a file named after it if that file does not exist yet, and append that specific line to the file. So, something like this:

fetch string in 2nd column -> "ab" -> if file doesn't exist create file called "ab.csv" -> open file and add line "04/Feb/2016:06:38:44-0500,ab,3,10,57,200,10254"
fetch string in 2nd column -> "cd" -> if file doesn't exist create file called "cd.csv" -> open file and add line "04/Feb/2016:06:39:07-0500,cd,1,42,168,304,0"
fetch string in 2nd column -> "ef" -> if file doesn't exist create file called "ef.csv" -> open file and add line "04/Feb/2016:06:39:07-0500,ef,1,43,169,304,0"
fetch string in 2nd column -> "ab" -> if file doesn't exist create file called "ab.csv" -> open file and add line "04/Feb/2016:06:39:07-0500,ab,1,43,170,304,0"
fetch string in 2nd column -> "cd" -> if file doesn't exist create file called "cd.csv" -> open file and add line "04/Feb/2016:06:39:07-0500,cd,1,44,171,304,0"
fetch string in 2nd column -> "ef" -> if file doesn't exist create file called "ef.csv" -> open file and add line "04/Feb/2016:06:39:07-0500,ef,1,45,172,304,0"

Result:

ab.csv:
04/Feb/2016:06:38:44-0500,ab,3,10,57,200,10254
04/Feb/2016:06:39:07-0500,ab,1,43,170,304,0
----------------------------------------------
cd.csv:
04/Feb/2016:06:39:07-0500,cd,1,42,168,304,0
04/Feb/2016:06:39:07-0500,cd,1,44,171,304,0
----------------------------------------------
ef.csv:
04/Feb/2016:06:39:07-0500,ef,1,43,169,304,0
04/Feb/2016:06:39:07-0500,ef,1,45,172,304,0

Any help appreciated!

    
by vayacondios2015 07.10.2016 / 16:47

2 answers

5

Using awk

$ awk -F, '{print >> ($2 ".csv")}' file.csv

$ cat ab.csv
04/Feb/2016:06:38:44-0500,ab,3,10,57,200,10254
04/Feb/2016:06:39:07-0500,ab,1,43,170,304,0
$ cat cd.csv
04/Feb/2016:06:39:07-0500,cd,1,42,168,304,0
04/Feb/2016:06:39:07-0500,cd,1,44,171,304,0
$ cat ef.csv
04/Feb/2016:06:39:07-0500,ef,1,43,169,304,0
04/Feb/2016:06:39:07-0500,ef,1,45,172,304,0
$

Bear in mind that real CSV files may include quoted commas inside their comma-separated fields, so a proper CSV parser is always recommended for serious use: see for example How do I read a CSV file using Perl? or PyMOTW: Comma Separated Value Files.
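To illustrate the caveat about quoted commas, here is a minimal Python sketch comparing a naive comma split (which is effectively what awk -F, does) with the standard csv module. The sample line containing a quoted comma is made up for the illustration and is not part of the question's data.

```python
import csv
import io

# Hypothetical line: the second field contains a quoted comma.
line = '04/Feb/2016:06:38:44-0500,"ab,x",3,10,57,200,10254\n'

# Naive split on every comma: the quoted field is cut in two.
naive_fields = line.rstrip("\n").split(",")

# The csv module honours the quotes and keeps "ab,x" as one field.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields), naive_fields[1])
print(len(parsed_fields), parsed_fields[1])
```

With the naive split, the second field would come out as a fragment of the quoted value, so the output file name derived from it would be wrong.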

    
by steeldriver 07.10.2016 / 17:03
2

I don't know if you are still looking for a Pythonic solution. I'm impressed by the simplicity of steeldriver's answer; I didn't realise awk was so powerful.

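#!/usr/bin/env python3

import csv


def main():
    # newline="" lets the csv module handle line endings itself
    with open("file.csv", newline="") as src:
        for row in csv.reader(src):
            if len(row) < 2:
                continue  # skip blank or short lines
            fname = row[1] + ".csv"
            # "a" creates the file if it is missing and appends otherwise;
            # "w" would overwrite the file on every matching row
            with open(fname, "a", newline="") as dst:
                csv.writer(dst, lineterminator="\n").writerow(row)


main()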
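A possible variation, sketched under assumptions: instead of reopening an output file for every row, each output file is opened once and kept open until the input has been read. The function name split_by_column and its parameters out_dir and column are hypothetical, chosen only for this illustration.

```python
import csv
import os
from contextlib import ExitStack


def split_by_column(src_path, out_dir=".", column=1):
    """Append each row of src_path to <out_dir>/<value in column>.csv."""
    writers = {}
    # ExitStack closes the input and every output file on exit.
    with ExitStack() as stack:
        src = stack.enter_context(open(src_path, newline=""))
        for row in csv.reader(src):
            if len(row) <= column:
                continue  # skip blank or short lines
            key = row[column]
            if key not in writers:
                path = os.path.join(out_dir, key + ".csv")
                # "a" creates the file if needed and appends otherwise
                dst = stack.enter_context(open(path, "a", newline=""))
                writers[key] = csv.writer(dst, lineterminator="\n")
            writers[key].writerow(row)
    # Return the distinct keys that were seen, sorted.
    return sorted(writers)
```

Calling split_by_column("file.csv") on the sample data would append to ab.csv, cd.csv and ef.csv in the current directory. With a very large number of distinct keys this approach could run into the per-process limit on open files, in which case reopening per row is the safer choice.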

Can't blame me for trying :D for the shiny bounty

    
by Thu Yein Tun 17.10.2016 / 15:21