Unix command to flatten nested JSON object data

1

Example of the JSON input data format:

data: {
   div1: {
      name: "some name",
      age: number,
      address_1: "some address",
      items: {
         item_x1: "some data",
         ..
         ..
      }
   }
   ..
   ..
}

The expected result is flattened JSON, one object per div:

{ "data.div1.name":"some name",..., "data.div1.items.item_x1":"some data",...},
..
..
{ "data.divN.name":"some name",... }

The field names may not be known in advance, so the solution should not depend on filtering for specific fields.

Any ideas for a Unix-based command?

    
by mr.tee 04.05.2018 / 12:05

2 answers

2

Take a look at gron. From the linked page:

Make JSON greppable!

gron transforms JSON into discrete assignments to make it easier to grep for what you want and see the absolute 'path' to it. It eases the exploration of APIs that return large blobs of JSON but have terrible documentation.

▶ gron "https://api.github.com/repos/tomnomnom/gron/commits?per_page=1" | fgrep "commit.author"
json[0].commit.author = {};
json[0].commit.author.date = "2016-07-02T10:51:21Z";
json[0].commit.author.email = "[email protected]";
json[0].commit.author.name = "Tom Hudson";

gron can work backwards too, enabling you to turn your filtered data back into JSON:

▶ gron "https://api.github.com/repos/tomnomnom/gron/commits?per_page=1" | fgrep "commit.author" | gron --ungron
[
  {
    "commit": {
      "author": {
        "date": "2016-07-02T10:51:21Z",
        "email": "[email protected]",
        "name": "Tom Hudson"
      }
    }
  }
]
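For the data in the question, the same idea (one line per absolute path) can be approximated with jq when gron is not installed. This is only a rough sketch of gron-like output, not gron itself; the file name sample.json and its cut-down contents are assumptions based on the question's example.

```shell
# Hypothetical sample based on the question's data (assumption).
cat > sample.json <<'EOF'
{"data":{"div1":{"name":"some name","age":1,"items":{"item_x1":"some data"}}}}
EOF

# Print one "path = value" line per scalar leaf, similar in spirit to gron.
# map(tostring) lets array indices, if present, be joined with "." as well.
lines=$(jq -r 'paths(scalars) as $p
  | "\($p | map(tostring) | join(".")) = \(getpath($p) | tojson)"' sample.json)
printf '%s\n' "$lines"
```

Each printed line can then be grepped by its full path, which is the property gron is built around.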
    
by 04.05.2018 / 12:34
1

jq is the right tool for processing JSON data.

Sample input.json:

{
  "data": {
    "div1": {
      "name": "some name",
      "age": 1,
      "address_1": "some address",
      "items": {
        "item_x1": "some data"
      }
    },
    "div2": {
      "name": "some other name",
      "age": 2,
      "address_2": "some address",
      "items": {
        "item_x2": "some data"
      }
    },
    "div3": {
      "name": "another name",
      "age": 3,
      "address_3": "some address",
      "items": {
        "item_x3": "some data"
      }
    }
  }
}

jq -c '"data" as $main_k | .data as $data | .data | to_entries
       | group_by(.key) | map(from_entries)[] | [paths(scalars)]
       | map(("\($main_k)." + join(".")) as $key
             | {($key): (reduce .[] as $k ($data; . = .[$k]))})
       | add' input.json

The output:

{"data.div1.name":"some name","data.div1.age":1,"data.div1.address_1":"some address","data.div1.items.item_x1":"some data"}
{"data.div2.name":"some other name","data.div2.age":2,"data.div2.address_2":"some address","data.div2.items.item_x2":"some data"}
{"data.div3.name":"another name","data.div3.age":3,"data.div3.address_3":"some address","data.div3.items.item_x3":"some data"}
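A shorter variant, offered as a sketch rather than a replacement for the command above: it builds each flattened key directly from paths(scalars) and reads the value with getpath, so the reduce step is not needed. It assumes, as in the question, that the top-level key is data and that each child of data becomes one output object. The sample.json file here is a hypothetical cut-down version of the input above.

```shell
# Hypothetical reduced sample (assumption); the full input.json works the same way.
cat > sample.json <<'EOF'
{"data":{"div1":{"name":"some name","age":1,"items":{"item_x1":"some data"}},
         "div2":{"name":"some other name","age":2,"items":{"item_x2":"some data"}}}}
EOF

# For each child of .data, collect every scalar leaf as a {key, value} pair
# whose key is the dotted path prefixed with "data.<div>", then build one object.
flat=$(jq -c '.data | to_entries[] | .key as $div | .value
  | [paths(scalars) as $p
     | {key: (["data", $div] + ($p | map(tostring)) | join(".")),
        value: getpath($p)}]
  | from_entries' sample.json)
printf '%s\n' "$flat"
```

For this sample the command should print one compact object per div, with keys such as data.div1.name and data.div1.items.item_x1. No field names are hard-coded, so unknown fields are flattened the same way.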
    
by 04.05.2018 / 14:11