git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Go SDK: Bigquery and nullable field types.


I am using the bigqueryio transform and I am using the following struct to collect a data row:

type Record {
  source_service  biquery.NullString
  .. etc...
}

This works fine with the direct runner, but when I try it with the dataflow runner, then I get the following exception:
 
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error received from SDK harness for instruction -41: execute failed: bigquery: schema field source_service of type STRING is not assignable to struct field source_service of type struct { StringVal string; Valid bool }
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
	at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:55)
	at com.google.cloud.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.finish(RegisterAndProcessBundleOperation.java:274)
	at com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:83)
	at com.google.cloud.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:101)
	at com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:391)
	at com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:360)
	at com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:288)
	at com.google.cloud.dataflow.worker.DataflowRunnerHarness.start(DataflowRunnerHarness.java:179)
	at com.google.cloud.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:107)
	Suppressed: java.lang.IllegalStateException: Already closed.
		at org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:97)
		at com.google.cloud.dataflow.worker.fn.data.RemoteGrpcPortWriteOperation.abort(RemoteGrpcPortWriteOperation.java:93)
		at com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:89)
		... 6 more

Looks like the bigquery API is failing to detect the nullable type NullString, and instead is attempting a plain assignment. Could it be that some aspect of the type information has been lost thus preventing the bigquery API from identifying and handling NullString properly?