
manipulate large csv-file


dejhost

5 hours ago, TheXman said:

@dejhost

FYI,

Because you're dealing with such large datasets, it is easy to miss some of the details & data if you aren't careful.  For example, in the sample JSON file that you provided, there are 2 images that don't have any .Label object information.  Therefore, those 2 records did not show up in the CSV data.  If that's what was expected, then all is good.  If you expected to see all images, whether they had object information or not, then those types of records would need to be taken into account.  So if you noticed and were wondering why you were missing some of the input images in the CSV data, that's why.  :)

The 2 JSON objects without label object information:

{
  "ID": "ckjayp5lc00003a69wakzxsxg",
  "DataRow ID": "ckj9odjg37bi20rh6ei442ug7",
  "Labeled Data": "https://storage.labelbox.com/ckj9o1nvf6pqk0716vn24noz7%2F0bacdc01-bee2-3b16-627e-671b8f6c2a1a-DSC04905.jpg?Expires=1611474714892&KeyName=labelbox-assets-key-1&Signature=dvsZdXT3dbUeRBvORvVxtMhhyMA",
  "Label": {},
  "Created By": "victor@oasisoutsourcing.co.ke",
  "Project Name": "SubSeaScanning",
  "Created At": "2020-12-30T05:11:30.000Z",
  "Updated At": "2020-12-30T05:23:05.000Z",
  "Seconds to Label": 32.094,
  "External ID": "DSC04905.jpg",
  "Agreement": -1,
  "Benchmark Agreement": -1,
  "Benchmark ID": null,
  "Dataset Name": "Trial",
  "Reviews": [
    {
      "score": 1,
      "id": "ckjc8j0k60gk50yaw07n69wiu",
      "createdAt": "2020-12-31T02:34:25.000Z",
      "createdBy": "victor@oasisoutsourcing.co.ke"
    }
  ],
  "View Label": "https://editor.labelbox.com?project=ckj9obfp954gq0718tasdrinc&label=ckjayp5lc00003a69wakzxsxg"
}
{
  "ID": "ckjmlilco000039686bfxpf8k",
  "DataRow ID": "ckjm8xorhfbe80rj53hi0bg36",
  "Labeled Data": "https://storage.labelbox.com/ckj9o1nvf6pqk0716vn24noz7%2F4644e43b-5e58-ff51-d1e3-4551ed722d6f-n101_0408.jpg?Expires=1611474715610&KeyName=labelbox-assets-key-1&Signature=erTXaMYyBO56kurONJv6AVGt8zU",
  "Label": {},
  "Created By": "evans@oasisoutsourcing.co.ke",
  "Project Name": "SubSeaScanning",
  "Created At": "2021-01-07T08:31:15.000Z",
  "Updated At": "2021-01-07T08:31:16.000Z",
  "Seconds to Label": 219.925,
  "External ID": "n101_0408.jpg",
  "Agreement": -1,
  "Benchmark Agreement": -1,
  "Benchmark ID": null,
  "Dataset Name": "Aassgard Spool - Batch 1",
  "Reviews": [
    {
      "score": 1,
      "id": "ckjn49ska0p010yd17ws9eyyh",
      "createdAt": "2021-01-07T17:20:44.000Z",
      "createdBy": "victor@oasisoutsourcing.co.ke"
    }
  ],
  "View Label": "https://editor.labelbox.com?project=ckj9obfp954gq0718tasdrinc&label=ckjmlilco000039686bfxpf8k"
}

 

Thanks for the hint! There is no relevant information to be extracted from those two images right now. But I will need to double-check why they have no labels. Maybe an error occurred at an earlier stage.
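To list the records that have no label information, a minimal Python sketch could look like the one below. It assumes the export has already been parsed into a list of dicts shaped like the two objects quoted above. The "objects" key and the "example.jpg" record in the sample data are hypothetical placeholders, not taken from the real export.

```python
def split_by_label(records):
    """Split records into (labeled, unlabeled) based on the "Label" field.

    A record counts as unlabeled when "Label" is missing or empty ({}).
    """
    labeled, unlabeled = [], []
    for rec in records:
        if rec.get("Label"):
            labeled.append(rec)
        else:
            unlabeled.append(rec)
    return labeled, unlabeled


# Tiny stand-in for the parsed export; only the fields needed here.
sample = [
    {"External ID": "DSC04905.jpg", "Label": {}},
    # Hypothetical labeled record; the real "Label" structure may differ.
    {"External ID": "example.jpg", "Label": {"objects": [{"title": "placeholder"}]}},
]

labeled, unlabeled = split_by_label(sample)
unlabeled_ids = [rec["External ID"] for rec in unlabeled]
print(unlabeled_ids)
```

Running this over the full export would give a list of image names to check against the earlier processing stages.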

I also realised that some images have a "(1)" in the filename, which indicates duplicates and might also cause trouble later down the line.
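One way to find such files is to strip a trailing "(n)" from each name and group the names that collapse to the same base. This is only a sketch: it assumes the copy marker sits directly before the file extension, and the "DSC04905 (1).jpg" name in the example is made up for illustration.

```python
import re
from collections import defaultdict

# Matches an optional space plus "(digits)" directly before the extension,
# e.g. "name (1).jpg" or "name(2).jpg".
COPY_SUFFIX = re.compile(r"\s*\(\d+\)(?=\.[^.]+$)")


def group_possible_duplicates(filenames):
    """Map each base filename to the names that reduce to it.

    Only groups worth reviewing are returned: those with more than one
    member, or with a single member that carries a copy marker.
    """
    groups = defaultdict(list)
    for name in filenames:
        base = COPY_SUFFIX.sub("", name)
        groups[base].append(name)
    return {
        base: names
        for base, names in groups.items()
        if len(names) > 1 or names[0] != base
    }


# Hypothetical file list; "DSC04905 (1).jpg" is an invented example.
names = ["DSC04905.jpg", "DSC04905 (1).jpg", "n101_0408.jpg"]
suspects = group_possible_duplicates(names)
print(suspects)
```

A matching name only suggests a duplicate; comparing file sizes or hashes would be needed to confirm that the image content is identical.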
