In our first blog, we used a regular expression to replace the quotes in genres. Afterward, we were able to UNNEST() the JSON object. We’ll be working with the same data set in this blog.
In our data:
Embedded content: https://gist.github.com/nfarah86/ef1cc9da88e56226c4c46fd0e3c8e16e
there’s a JSON string called spoken_languages, and it’s formatted similarly to genres:
[ { "spoken_languages": "[{'iso_639_1': 'fr', 'name': 'Français'}]" }]
Assuming everything is consistent, we can just write the SQL statement similar to what we wrote for genres, right?
Wrong ⛔️
We actually get a parsing error:
json parse error on line 0 near `\x9akai"}, {"iso_': unknown escape 120 in string
So, we kind of know what the culprit is here, but we don’t know if there are going to be more parsing errors. Let’s go ahead and debug this, so we can get this SQL statement working!
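To see why the parser complains, here is a minimal Python sketch (not the SQL from this post) that reproduces the same kind of failure. The sample string and variable names are assumptions made up for illustration; the point is only that JSON has no `\x` escape, which is what "unknown escape 120" refers to (120 is the character code for `x`).

```python
import json

# Hypothetical sample value: a list literal stored as a string, containing
# a literal backslash followed by "x9a" (one backslash in the actual data).
raw = "[{'iso_639_1': 'lt', 'name': 'Lietuvi\\x9akai'}]"

# Same first step as for genres: swap single quotes for double quotes.
quoted = raw.replace("'", '"')

try:
    json.loads(quoted)
    error = None
except json.JSONDecodeError as exc:
    # JSON only permits a small fixed set of escapes after a backslash,
    # and "x" is not one of them, so the parser rejects the string.
    error = exc

print(error)
```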
We can actually use REGEXP_LIKE() to see what exactly is causing the error:
SELECT
spoken_languages
FROM
commons.TwtichMovies t
WHERE REGEXP_LIKE(t.spoken_languages,'x9akai')
This is a sample of what we get back:
This is great: it looks like the backslashes are actually causing some issues. We can use REGEXP_REPLACE() to replace these slashes with an empty string:
SELECT (REGEXP_REPLACE('Lietuvi\\x9akai', '\\\\', ''));
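The same find-then-strip idea can be sketched in Python, which may help if you want to experiment outside the query editor. This is only an illustration under assumptions: the two rows are invented stand-ins for the spoken_languages column, and `re.search` / `re.sub` play the roles of REGEXP_LIKE() and REGEXP_REPLACE().

```python
import re

# Hypothetical rows standing in for the spoken_languages column.
rows = [
    "[{'iso_639_1': 'fr', 'name': 'Français'}]",
    "[{'iso_639_1': 'lt', 'name': 'Lietuvi\\x9akai'}]",
]

# Like REGEXP_LIKE: keep rows where the pattern matches anywhere in the string.
affected = [row for row in rows if re.search(r"x9akai", row)]

# Like REGEXP_REPLACE: the regex r"\\" matches one literal backslash,
# and every match is replaced with an empty string.
cleaned = [re.sub(r"\\", "", row) for row in affected]

print(cleaned)
```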
Now, to put it all together: how do we UNNEST() spoken_languages and fix the two issues at hand (the string format and the double slashes)?
Hint:
SELECT REGEXP_REPLACE(REGEXP_REPLACE('[{''iso_639_1'': ''lt'', ''name'':''Lietuvi\\x9akai''}]', '''', '"'), '\\\\', '');
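If it helps to check the logic of the hint before writing the full query, the two chained replacements can be sketched in Python. This is an analogy under assumptions, not the query itself: `parse_spoken_languages` is a hypothetical helper name, the sample string is invented, and looping over the parsed list stands in for what UNNEST() does (one row per array element).

```python
import json
import re


def parse_spoken_languages(raw: str) -> list:
    """Apply both fixes to a raw string, then parse it as JSON."""
    quoted = re.sub(r"'", '"', raw)      # fix 1: single quotes -> double quotes
    cleaned = re.sub(r"\\", "", quoted)  # fix 2: drop the stray backslashes
    return json.loads(cleaned)


# Hypothetical sample value with one literal backslash before "x9a".
raw = "[{'iso_639_1': 'lt', 'name': 'Lietuvi\\x9akai'}]"

# One output element per parsed object, similar in spirit to UNNEST().
names = [language["name"] for language in parse_spoken_languages(raw)]
print(names)
```

One caveat of this sketch: replacing every single quote is naive, so a language name that itself contains an apostrophe would still break the parse.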
I’m sure you can take it to the finish line from here, but just in case, you can watch the YouTube link down below to catch the full replay!
Embedded content: https://youtu.be/8aHgJrQjT4U
TLDR: you can find all the resources you need in the developer corner.