
Readiness probe failed: HTTP probe failed with statuscode: 503 after all files have been sent to bigquery #1369

Open
madalina2311 opened this issue Aug 9, 2022 · 5 comments
Labels: gcp (Issues relating to GCP), inputs (Any tasks or issues relating specifically to inputs), needs investigation (It looks as though we have all the information needed but investigation is required), outputs (Any tasks or issues relating specifically to outputs)

Comments

@madalina2311

Hi,

I am using Benthos 4.5.0 with the gcp_cloud_storage input and the gcp_bigquery output (streaming mode).
I have a folder containing a bunch of JSON logs, and my pipeline batches those and sends them to a BQ table.

```yaml
input:
  label: "gcs"
  gcp_cloud_storage:
    bucket: "benthos_storage"
    prefix: "dev/StatusUpdate/2022/08/05/16"
    codec: all-bytes
    delete_objects: true

output:
  label: "bq"
  gcp_bigquery:
    project: "dev"
    dataset: "dev_status"
    table: "StatusUpdate"
    format: NEWLINE_DELIMITED_JSON
    max_in_flight: 100
    create_disposition: "CREATE_IF_NEEDED"
    auto_detect: true
    batching:
      count: 100
      byte_size: 0
      period: 30s
```

The problem appears after all the files from the folder have been sent to BQ. The only error I get is "Readiness probe failed: HTTP probe failed with statuscode: 503". My pod is still running, but if I add a new file to the folder it does not get delivered to BQ unless I restart the pod, and it gets stuck again once all the newly added files are sent to BQ.

The same behaviour happens with "delete_objects: false".

Can you advise? Am I configuring something wrong?
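
For context, the probe reporting the 503 is a standard Kubernetes HTTP readiness probe pointed at Benthos's /ready endpoint; ours looks roughly like the following (the port assumes the default Benthos HTTP address of 0.0.0.0:4195):

```yaml
readinessProbe:
  httpGet:
    # Benthos answers /ready with 200 while its streams are
    # connected, and 503 otherwise.
    path: /ready
    port: 4195
  periodSeconds: 10
```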

@mihaitodor added the inputs, outputs, needs investigation, and gcp labels on Aug 9, 2022
@mihaitodor
Collaborator

Thanks for raising this @madalina2311! It will require some investigation, but, to help narrow it down, how did you determine that "all the files from the folder have been sent to BQ"? Benthos should shut down automatically for such inputs once there isn't anything left to read. Does your pod contain just the Benthos container, or does it also have some sidecar which acts as a metrics proxy and perhaps reports 503 when Benthos shuts down? Do you see any relevant logs coming from the Benthos container?

PS: There's some work being done in #1363 to add a shutdown_delay config field, but it hasn't been merged yet.

@madalina2311
Author

The folder contains events from NATS (one event per file). I have a test stream and I know how many messages I send; I get the same number of JSON files inside the folder. On top of that, I have delete_objects: true ("Whether to delete downloaded objects from the bucket once they are processed"), so I take this to mean it deletes each processed event, i.e. each one added to the BQ table.

I do a count of events on the BQ table and it equals the number of JSON files (events I sent to NATS). At this point the bucket is empty and I get "Readiness probe failed: HTTP probe failed with statuscode: 503", but nothing relevant appears in the Benthos logs: level=info msg="Inserting messages as objects to GCP BigQuery:development:nats:SingleCall" @service=benthos label=bq_singlecall path=root.output stream=bq_SingleCall

The rest of the pipelines keep working despite the 503 (adding logs to the bucket from NATS: nats -> gcs); when I start generating events I can see the messages appearing in the buckets. But the copy from the GCS bucket to BQ does not resume unless I restart the pod.

Also, there is no sidecar deployed.

So am I right that this pipeline shuts down once the last file from the bucket is processed (sent to BQ), and something has to restart it to resume processing (every x minutes)? It is not like a watcher that streams messages to BQ each time they appear in the bucket? Is there something I could add to make my pipeline run on a continuous basis? I know that streaming from NATS to BQ works, but the use case here is GCS to BQ, as I get multiple events inside the buckets (from other sources as well).

@mihaitodor
Collaborator

So am I right that this pipeline shuts down once the last file from the bucket is processed (sent to BQ), and something has to restart it to resume processing (every x minutes)?

Yeah, that's what I'd expect to happen, but I need to set up a GCP account to test. Try testing it locally if you can and confirm. I wonder if you'll see anything extra in the logs if you set the level to ALL.
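
That would be something like this in the config (ALL is the most verbose level the Benthos logger accepts):

```yaml
logger:
  # Log everything, including trace-level events.
  level: ALL
```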

It is not like a watcher that streams messages to BQ each time they appear in the bucket?

Nope, it only lists the files in the bucket once and then starts reading the ones it discovered. It should shut down once it's done.

Is there something I could add to make my pipeline run on a continuous basis?

I guess you could try wrapping the input in a read_until input where you'd set restart_input: true and check: "false" (see the sketch below). However, it might try to list the bucket very frequently if it happens to be empty for a while. I think that can be worked around by using a sequence input in combination with a generate input and a sleep processor, or maybe some simpler way that I can't think of right now... We could also add an extra config field to the read_until input to have it sleep a while before restarting the underlying input. I can try to experiment with it myself if you get stuck. Just let me know.
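
A minimal, untested sketch of that read_until wrapper, reusing the bucket settings from your config above (the tight-relisting caveat still applies):

```yaml
input:
  read_until:
    # Never terminate based on message contents; "false" never matches,
    # so the wrapper runs indefinitely.
    check: "false"
    # Restart the wrapped input once it has drained the bucket.
    restart_input: true
    input:
      gcp_cloud_storage:
        bucket: "benthos_storage"
        prefix: "dev/StatusUpdate/2022/08/05/16"
        codec: all-bytes
        delete_objects: true
```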

@madalina2311
Author

Hi. I am adding new notes here as I pulled the latest version and ran some tests again. The following appears after all files have been processed. Note that in my situation (because I use delete_objects: true) I end up with an empty folder, which is then automatically deleted from GCS.

```
2022/09/13 09:09:25 http: panic serving 12.0.0.86:39198: runtime error: invalid memory address or nil pointer dereference
goroutine 2492 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1850 +0xbf
panic({0x2ae91c0, 0x52f9870})
	/usr/local/go/src/runtime/panic.go:890 +0x262
github.com/benthosdev/benthos/v4/internal/impl/pure.(*readUntilInput).Connected(0xc00271b110?)
	/go/src/github.com/benthosdev/benthos/internal/impl/pure/input_read_until.go:235 +0x1c
github.com/benthosdev/benthos/v4/internal/stream.(*Type).IsReady(0xc000e66000)
	/go/src/github.com/benthosdev/benthos/internal/stream/type.go:87 +0x32
github.com/benthosdev/benthos/v4/internal/stream/manager.(*StreamStatus).IsReady(...)
	/go/src/github.com/benthosdev/benthos/internal/stream/manager/type.go:49
github.com/benthosdev/benthos/v4/internal/stream/manager.(*Type).HandleStreamReady(0xc000882870, {0x39f34b0, 0xc001730540}, 0xc000568ca0?)
	/go/src/github.com/benthosdev/benthos/internal/stream/manager/api.go:611 +0x125
github.com/benthosdev/benthos/v4/internal/api.(*Type).RegisterEndpoint.func1({0x39f34b0, 0xc001730540}, 0xc003076a80?)
	/go/src/github.com/benthosdev/benthos/internal/api/api.go:264 +0xcf
net/http.HandlerFunc.ServeHTTP(0xc00231a500?, {0x39f34b0?, 0xc001730540?}, 0x800?)
	/usr/local/go/src/net/http/server.go:2109 +0x2f
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000635c80, {0x39f34b0, 0xc001730540}, 0xc00231a300)
	/go/pkg/mod/github.com/gorilla/mux@v1.8.0/mux.go:210 +0x1cf
net/http.serverHandler.ServeHTTP({0xc003076900?}, {0x39f34b0, 0xc001730540}, 0xc00231a300)
	/usr/local/go/src/net/http/server.go:2947 +0x30c
net/http.(*conn).serve(0xc000a42c80, {0x39f4e90, 0xc0029c8450})
	/usr/local/go/src/net/http/server.go:1991 +0x607
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:3102 +0x4db
```

@Jeffail
Collaborator

Jeffail commented Sep 15, 2022

Hey @madalina2311 thanks for the trace, should be fixed with: 64eb723
