drive_download() error with KML mimeType #441

Open
caldwellst opened this issue Jul 21, 2023 · 4 comments

caldwellst commented Jul 21, 2023

There is a bug in drive_download() that causes it to fail on KML files, depending on their mime type. KML mimeTypes are often text/xml or another plain-text type. However, depending on how the files are loaded onto Drive, they are sometimes stored as application/vnd.google-earth.kml+xml.

Since drive_download() checks the mimeType with grepl("google", mime_type), these KML files are mistakenly assumed to be a native Google Drive type, and get_export_mime_type() then raises an error because it does not recognise the mime type.

#> Error in `get_export_mime_type()`:
#> ! Not a recognized Google MIME type:
#> ✖ application/vnd.google-earth.kml+xml
#> Run `rlang::last_trace()` to see where the error occurred.

I believe this behavior is not ideal: downloading a file should be okay even if its mime type isn't explicitly supported by the Google Drive API, and get_export_mime_type() is only consulted when the file appears to be a Google type. However, I may be wrong there.

Note that this code to check mime_type is also present in drive_read().
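
To illustrate the over-broad match described above, here is a minimal sketch in base R. The `grepl("google", ...)` check is the one quoted in this issue; the stricter prefix check is only an assumption about one possible alternative (native Google Drive types start with `application/vnd.google-apps.`), not the package's actual code.

```r
# Three example mimeTypes: a native Google Sheet, the problematic KML type
# from this issue, and the plain text/xml type that downloads fine.
mime_types <- c(
  "application/vnd.google-apps.spreadsheet",
  "application/vnd.google-earth.kml+xml",
  "text/xml"
)

# The substring check quoted in this issue: it also matches "google-earth".
loose <- grepl("google", mime_types)

# Hypothetical stricter check (an assumption, not googledrive's code):
# anchor on the prefix shared by native Google Drive types.
strict <- startsWith(mime_types, "application/vnd.google-apps.")

data.frame(mime_types, loose, strict)
```

With the loose check, the KML type is flagged as a Google type; with the prefix check, only the spreadsheet is.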

jennybc (Member) commented Jul 21, 2023

Can you provide a link to a file that is problematic? I.e. a way to actually experience the problem.

caldwellst (Author) commented Jul 22, 2023

Apologies, yes, here's a simple reprex of the issue (ignoring authorisation for googledrive). It uses this publicly available KML file stored on Google Drive. I also manually downloaded the file and uploaded it to my Drive to show that the default mimeType usually works without problems; the error is raised only when the mimeType is set to application/vnd.google-earth.kml+xml. I'm not an expert in why that might be, but I believe it might occur when the KML is added to Drive programmatically, for instance directly from Google Earth Engine.

library(googledrive)
library(tidyverse)
library(sf)

drive_download(
    as_id("12bPHu0w8gyEmoeblBAg25c8ch842THw7")
)
#> Error in `get_export_mime_type()`:
#> ! Not a recognized Google MIME type:
#> ✖ application/vnd.google-earth.kml+xml

# pedantic, but we can check the mimeType is what's specified in the error:

original_dribble <- drive_get( as_id("12bPHu0w8gyEmoeblBAg25c8ch842THw7"))

original_dribble %>%
    pull(drive_resource) %>%
    pluck(1) %>%
    pluck("mimeType")
#> [1] "application/vnd.google-earth.kml+xml"

# and we can successfully download the file if we manually adjust the type

original_dribble$drive_resource[[1]]$mimeType <- "text/xml"
drive_download(
    file = original_dribble,
    path = f <- tempfile(fileext = ".kml")
)
#> File downloaded:
#> • 13 Colonies Template.kml <id: 12bPHu0w8gyEmoeblBAg25c8ch842THw7>
#> Saved locally as:
#> • /var/folders/b7/_6hwb39d43l71kpy59b_clhr0000gn/T//RtmpxkJtgm/file36a721e09e40.kml

# and we can successfully read the file, no issues

bypassed_sf <- read_sf(f)
bypassed_sf
#> Simple feature collection with 14 features and 2 fields
#> Geometry type: POINT
#> Dimension:     XYZ
#> Bounding box:  xmin: -82.90712 ymin: 32.15744 xmax: -69.60236 ymax: 44.68772
#> z_range:       zmin: 0 zmax: 0
#> Geodetic CRS:  WGS 84
#> # A tibble: 14 × 3
#>    Name           Description                 geometry
#>    <chr>          <chr>                    <POINT [°]>
#>  1 Massachusetts  ""          Z (-69.60236 44.68772 0)
#>  2 Massachusetts  ""          Z (-71.38244 42.40721 0)
#>  3 Rhode Island   ""          Z (-71.47743 41.58009 0)
#>  4 Connecticut    ""          Z (-73.08775 41.60322 0)
#>  5 New Hampshire  ""           Z (-71.5724 43.19385 0)
#>  6 New York       ""          Z (-74.00597 40.71435 0)
#>  7 New Jersey     ""          Z (-74.42139 40.04444 0)
#>  8 Pennsylvania   ""          Z (-77.19452 41.20332 0)
#>  9 Delaware       ""          Z (-75.54199 39.18118 0)
#> 10 Maryland       ""          Z (-76.64127 39.04575 0)
#> 11 Virginia       ""          Z (-78.65689 37.43157 0)
#> 12 North Carolina ""           Z (-79.0193 35.75957 0)
#> 13 South Carolina ""          Z (-81.16372 33.83608 0)
#> 14 Georgia        ""          Z (-82.90712 32.15743 0)
#> Warning message:
#> In CPL_read_ogr(dsn, layer, query, as.character(options), quiet,  :
#>   automatically selected the first layer in a data source containing more than one.

# check with the same file that I manually downloaded and then uploaded to my drive

drive_download(
    file = as_id("1VNPS4fILxODi7wrlzurJtNgF1V8FrBva"),
    path = g <- tempfile(fileext = ".kml")
)
#> File downloaded:
#> • 13 Colonies Template.kml <id: 1VNPS4fILxODi7wrlzurJtNgF1V8FrBva>
#> Saved locally as:
#> • /var/folders/b7/_6hwb39d43l71kpy59b_clhr0000gn/T//RtmpxkJtgm/file36a72adf180c.kml

# here we see this has a mimeType of text/xml

drive_get( as_id("1VNPS4fILxODi7wrlzurJtNgF1V8FrBva")) %>%
    pull(drive_resource) %>%
    pluck(1) %>%
    pluck("mimeType")
#> [1] "text/xml"

# just confirming they are the same file

copied_sf <- read_sf(g)
all.equal(copied_sf, bypassed_sf)
#> [1] TRUE

caldwellst (Author) commented

I had a look through the table of mime types at googledrive:::.drive$mime_tbl and noticed that some mimeTypes have what seems to be a "base type" appended after a +, much like the problematic KML mimeType above.

types <- googledrive:::.drive$mime_tbl$mime_type
types[grepl("\\+", types)]
#> [1] "application/epub+zip"                         
#> [2] "application/vnd.google-apps.script+json"      
#> [3] "application/vnd.google-apps.script+text/plain"
#> [4] "image/svg+xml"                                
#> [5] "image/svg+xml"   

Could a simple solution be to exempt mimeTypes that specify a base type from the error? Again, this stretches my knowledge of mimeTypes, but it seems that could be a robust and potentially future-proof approach.
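
As a rough sketch of that idea in base R: the helper `needs_export()` below is hypothetical and not part of googledrive. It treats a mimeType as a native Google type only if it mentions "google" and carries no `+` suffix, under the assumption that native file types never include one.

```r
# Hypothetical helper (not part of googledrive): only mimeTypes that mention
# "google" AND have no "+suffix" are treated as needing an export type.
needs_export <- function(mime_type) {
  grepl("google", mime_type) & !grepl("+", mime_type, fixed = TRUE)
}

result <- needs_export(c(
  "application/vnd.google-apps.document",
  "application/vnd.google-earth.kml+xml",
  "text/xml"
))
result
```

Under this rule the KML file would skip the export lookup and be downloaded as-is, like the text/xml copy.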

jon23cooper commented

I am having the same issue.
I used drive_download() to download a KML file and then uploaded it using drive_upload(). When I look at the file information in my Google folder, the file type has changed from the original XML to Unknown File, and drive_download() will no longer download it.
