How does Okera handle tables where some files are invalid? For example, an Avro file that doesn't match the schema, e.g. the table declares a field as a map but the data contains a string.
In this situation, Okera defaults to not aborting on file parsing errors, so that the read can still make progress. Experience says that customers usually have a fraction of data that is bad (e.g. someone accidentally dropped the wrong file in a directory), and failing all reads to that table until this is fixed is very unappealing.
We follow the Hadoop default, which is to skip and log, making progress with whatever data is good. Currently these errors go only to the worker log and not to the client, which makes them hard to debug.
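
The skip-and-log behavior described above can be illustrated with a minimal Python sketch. This is not Okera's implementation: the names `scan_table`, `parse_file`, `DECLARED_SCHEMA`, and the `abort_on_error` flag are hypothetical, the "files" are in-memory lists of records rather than real Avro files, and skipping at whole-file granularity is an assumption made for the example.

```python
import logging

logger = logging.getLogger("worker")  # stands in for the worker log

# Hypothetical table schema: `tags` is declared as a map.
DECLARED_SCHEMA = {"id": int, "tags": dict}


def parse_file(records, schema):
    """Check every record in a file against the declared schema."""
    for record in records:
        for field, expected in schema.items():
            value = record.get(field)
            if not isinstance(value, expected):
                raise ValueError(
                    f"field {field!r}: expected {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
    return records


def scan_table(files, schema, abort_on_error=False):
    """Return (rows, skipped_paths), skipping files that fail to parse."""
    rows, skipped = [], []
    for path, records in files.items():
        try:
            rows.extend(parse_file(records, schema))
        except ValueError as err:
            if abort_on_error:
                raise  # strict mode: one bad file fails the whole read
            logger.warning("skipping %s: %s", path, err)
            skipped.append(path)
    return rows, skipped


files = {
    "part-0.avro": [{"id": 1, "tags": {"env": "prod"}}],
    "part-1.avro": [{"id": 2, "tags": "env=prod"}],  # string where a map is declared
}
rows, skipped = scan_table(files, DECLARED_SCHEMA)
```

Here the read returns the rows from `part-0.avro` and records `part-1.avro` as skipped. Returning the skipped paths alongside the rows is one way a caller could see what was dropped without having to read the worker log.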