Re: Read from a Google Sheet based BigQuery table - Python SDK


Not sure if this is helpful, but you can also share Google Sheets with service accounts directly. I am solving a similar problem by using the Google SDK directly to pull the data from the sheet, then feeding it into Beam via Scio's parallelize functionality. My dataset is small, so this worked for me.
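For the Python SDK (the subject of this thread) rather than Scio, the same idea could be sketched roughly as follows. This is only an illustration under assumptions: the key file path, spreadsheet ID, and cell range are placeholders supplied by the caller, the helper names are made up for this example, the sheet is assumed to be shared with the service account, and the first row is assumed to be a header. It uses `beam.Create` in place of Scio's parallelize, so it only suits data small enough to hold in memory.

```python
SHEETS_SCOPE = "https://www.googleapis.com/auth/spreadsheets.readonly"


def fetch_sheet_values(key_path, spreadsheet_id, cell_range):
    """Return the sheet's cells as a list of rows (each a list of strings)."""
    # Imported lazily so rows_to_dicts() can be used without these packages.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    creds = service_account.Credentials.from_service_account_file(
        key_path, scopes=[SHEETS_SCOPE])
    service = build("sheets", "v4", credentials=creds)
    result = service.spreadsheets().values().get(
        spreadsheetId=spreadsheet_id, range=cell_range).execute()
    return result.get("values", [])


def rows_to_dicts(values):
    """Turn [header, row, row, ...] into a list of dicts keyed by header."""
    if not values:
        return []
    header, *rows = values
    # The Sheets API drops trailing empty cells, so pad short rows with None.
    return [
        dict(zip(header, row + [None] * (len(header) - len(row))))
        for row in rows
    ]


def run(key_path, spreadsheet_id, cell_range):
    """Read the sheet on the launching machine, then feed the rows to Beam."""
    import apache_beam as beam

    records = rows_to_dicts(
        fetch_sheet_values(key_path, spreadsheet_id, cell_range))
    with beam.Pipeline() as pipeline:
        (pipeline
         | "SheetRows" >> beam.Create(records)
         | "Print" >> beam.Map(print))
```

Because the sheet is read before the pipeline is built, only the machine launching the job needs access to the sheet; the workers just receive the rows.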

On Wed, Jun 6, 2018 at 1:13 PM Chamikara Jayalath <chamikara@xxxxxxxxxx> wrote:


On Tue, Jun 5, 2018 at 9:56 PM Leonardo Biagioli <lbiagioli@xxxxxxxxxxx> wrote:

Hi Cham,

Thanks, but those pages cover authentication within Google Cloud Platform services; I need to authenticate the job against Sheets. Since the required scope is https://www.googleapis.com/auth/drive, is there a way to pass it when deploying a Dataflow job?


I haven't tried this, unfortunately, so I'm not sure whether it will work. Are you able to run queries against your federated table using the BQ dashboard (without using Dataflow)? Also make sure that the Compute Engine service account used by the Dataflow job is properly authenticated (as mentioned in the document I provided). I recommend contacting Google Cloud support for questions regarding the BQ and Dataflow services.

- Cham
 

Thank you,

Leonardo

 

From: Chamikara Jayalath <chamikara@xxxxxxxxxx>
Sent: Tuesday, June 5, 2018 19:26
To: user@xxxxxxxxxxxxxxx
Cc: dev@xxxxxxxxxxxxxxx
Subject: Re: Read from a Google Sheet based BigQuery table - Python SDK

 

See following regarding authenticating Dataflow jobs.

 

I'm not sure about information specific to sheets, seems like there's some info in following.

 

On Tue, Jun 5, 2018 at 10:16 AM Leonardo Biagioli <lbiagioli@xxxxxxxxxxx> wrote:

Hi Cham,

Thank you for taking time to answer!

Is there a way to properly authenticate a Beam job on the Dataflow runner? I need to specify the scope required to read from Sheets, but where can I set that parameter?

Regards,

Leonardo

 

On Jun 5, 2018 18:28, Chamikara Jayalath <chamikara@xxxxxxxxxx> wrote:

I don't think BQ federated tables support export jobs, so reading directly from such tables likely will not work. But reading using a query should work if your job is authenticated properly (I haven't tested this).
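To make the distinction concrete, reading through a query instead of naming the table might look roughly like the sketch below. It is untested against a Sheets-backed table and does not settle the scope question raised in this thread. The `build_query` helper and the table name passed to it are hypothetical; `beam.io.ReadFromBigQuery` comes from Beam releases newer than this thread (at the time the equivalent was `beam.io.Read(beam.io.BigQuerySource(query=...))`), and the pipeline options are assumed to carry the usual project and temp location settings.

```python
def build_query(table):
    """Build a standard SQL query over a fully qualified table name."""
    return "SELECT * FROM `{}`".format(table)


def run(table, argv=None):
    """Read the table via a query rather than a direct table read."""
    # Imported lazily so build_query() can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(argv)
    with beam.Pipeline(options=options) as pipeline:
        (pipeline
         | "QueryTable" >> beam.io.ReadFromBigQuery(
             query=build_query(table), use_standard_sql=True)
         | "Print" >> beam.Map(print))
```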

 

- Cham

 

On Tue, Jun 5, 2018, 5:56 AM Leonardo Biagioli <lbiagioli@xxxxxxxxxxx> wrote:

Hi guys,

I just wanted to ask whether it is possible to read from a Sheets-based BigQuery table from a Beam pipeline running on Dataflow.

I usually specify additional scopes during authentication when running simple Python code to do the same, but I wasn't able to find a reference to anything similar for Beam.
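For context, the plain-Python approach referred to here could look something like the following sketch, outside of Beam. It is not the original code from this message: the function name is made up, and it assumes the `google-auth` and `google-cloud-bigquery` packages plus application default credentials that are allowed to request the Drive scope.

```python
# Scopes requested when building credentials; the Drive scope is the one
# needed for BigQuery to read the underlying Google Sheet.
REQUIRED_SCOPES = (
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/bigquery",
)


def query_sheet_backed_table(sql):
    """Run a query with credentials that carry the extra Drive scope."""
    # Imported lazily so the module loads without the Google packages.
    import google.auth
    from google.cloud import bigquery

    credentials, project = google.auth.default(scopes=list(REQUIRED_SCOPES))
    client = bigquery.Client(credentials=credentials, project=project)
    return [dict(row.items()) for row in client.query(sql).result()]
```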

Could you please help?

Thank you very much!

Leonardo

 



--

Preston Marshall

Director, Data Engineering

www.cityblock.com

256-434-1050

preston@xxxxxxxxxxxxx

55 Washington St, Unit 552 Brooklyn, NY 11201