Google OCR response string


Hi there, 

I'm using Google OCR text detection. From the response, I want to extract only the recognized text. The response looks like this:

 

"text": "t"
                          }
                        ]
                      }
                    ]
                  }
                ],
                "blockType": "TEXT"
              }
            ]
          }
        ],
        "text": "I´m the wanted\npart, of, the, respons\n1\nI´m the wanted part of the respons 2\nI´m the wanted part of the respons 3\nI´m the wanted part of the respons 4\n"
      }
    }
  ]
}

or 

"text": "g"
                          }
                        ]
                      }
                    ]
                  }
                ],
                "blockType": "TEXT"
              }
            ]
          }
        ],
        "text": "I´m the wanted\npart of\nthe respons 1?\nI´m the wanted part of the respons 2\nI´m the wanted part of the respons 3\nI´m the wanted part of the respons 4\n"
      }
    }
  ]
}

or 

"text": "h"
                          }
                        ]
                      }
                    ]
                  }
                ],
                "blockType": "TEXT"
              }
            ]
          }
        ],
        "text": "I´m the wanted part of the\nresponse 1?\nI´m some unwanted crap\nI´m the wanted part of the respons 2\nI´m the wanted part of the respons 3\nI´m the wanted part of the respons 4\n"
      }
    }
  ]
}

 

These are some examples of what the relevant part of the response can look like. My problem is that I want a way to get only the marked parts, like this:

$string1 = I´m the wanted part of the respons 1
$string2 = I´m the wanted part of the respons 2
$string3 = I´m the wanted part of the respons 3
$string4 = I´m the wanted part of the respons 4

So what varies:

the number of line breaks (\n)

the length of each string, which can be between 1 and 99

between part 1 and part 2 there can be some unwanted text (it always looks the same)

 

Thanks, everybody. I know this looks totally weird.

Is there maybe a function that simply returns the received text as separate strings, one per line break?


It looks like the API response is standard JSON. You can use the UDF found HERE to parse the JSON and get the text objects. Or, if you are good with regular expressions, you can use them to parse out the text. Once you've got the text, you can use StringSplit to separate the lines, based on the "\n", into an array. From there, process the array items as you like.
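
As a rough illustration of those steps (parse the JSON, take the full text, split it on the line breaks, drop what you don't want), here is a minimal sketch in Python rather than AutoIt. The key names `responses` and `fullTextAnnotation` are an assumption, since the excerpts in the question only show the tail of the response. The function name `extract_lines` and the sample data are made up for the example.

```python
import json


def extract_lines(raw_json, unwanted=()):
    """Return the non-empty lines of the full OCR text, skipping known unwanted lines."""
    data = json.loads(raw_json)
    # Assumption: these key names are not visible in the posted excerpts.
    full_text = data["responses"][0]["fullTextAnnotation"]["text"]
    # After JSON parsing, each "\n" escape is a real line feed character.
    lines = [line.strip() for line in full_text.split("\n")]
    # Drop empty entries (e.g. after the trailing "\n") and known unwanted lines.
    return [line for line in lines if line and line not in unwanted]


# Hypothetical sample response, shaped like the excerpts in the question.
sample = json.dumps({
    "responses": [{
        "fullTextAnnotation": {
            "pages": [],
            "text": "wanted part 1\nsome unwanted text\nwanted part 2\n"
                    "wanted part 3\nwanted part 4\n",
        }
    }]
})

for line in extract_lines(sample, unwanted={"some unwanted text"}):
    print(line)
```

This only handles the case where every wanted part sits on its own line. If part 1 is wrapped over several lines, as in the first two examples, the pieces would still need to be joined by some rule that fits your data, which this sketch does not attempt.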
