
Readiness probe failed: HTTP probe failed with statuscode: 503 after all files have been sent to bigquery #1369

Open
madalina2311 opened this issue Aug 9, 2022 · 5 comments
Labels: gcp (Issues relating to GCP), inputs (Any tasks or issues relating specifically to inputs), needs investigation (It looks as though we have all the information needed but investigation is required), outputs (Any tasks or issues relating specifically to outputs)

Comments

@madalina2311

Hi,

I am using Benthos 4.5.0 with the gcp_cloud_storage input and the gcp_bigquery output (streaming mode).
I have a folder containing a bunch of JSON logs, and my pipeline batches those and sends them to a BQ table.

```yaml
input:
  label: "gcs"
  gcp_cloud_storage:
    bucket: "benthos_storage"
    prefix: "dev/StatusUpdate/2022/08/05/16"
    codec: all-bytes
    delete_objects: true

output:
  label: "bq"
  gcp_bigquery:
    project: "dev"
    dataset: "dev_status"
    table: "StatusUpdate"
    format: NEWLINE_DELIMITED_JSON
    max_in_flight: 100
    create_disposition: "CREATE_IF_NEEDED"
    auto_detect: true
    batching:
      count: 100
      byte_size: 0
      period: 30s
```

The problem appears after all the files from the folder have been sent to BQ. The only error I get is "Readiness probe failed: HTTP probe failed with statuscode: 503". My pod is still running, but if I add a new file to the folder it does not get delivered to BQ unless I restart the pod, and it gets stuck again once all the newly added files are sent to BQ.

The same behaviour happens with "delete_objects: false".

Can you advise? Am I configuring something wrong?
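
For context, the probe reporting the 503 is a standard Kubernetes HTTP readiness probe pointed at Benthos's /ready endpoint; ours looks roughly like the following (the port assumes the default Benthos HTTP address of 0.0.0.0:4195):

```yaml
readinessProbe:
  httpGet:
    # Benthos answers /ready with 200 while its streams are
    # connected, and 503 otherwise.
    path: /ready
    port: 4195
  periodSeconds: 10
```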

@mihaitodor added the inputs, outputs, needs investigation, and gcp labels on Aug 9, 2022
@mihaitodor
Collaborator

Thanks for raising this @madalina2311! It will require some investigation, but, to help narrow it down, how did you determine that "all the files from the folder have been sent to BQ"? Benthos should shut down automatically for such inputs once there isn't anything left to read. Does your pod contain just the Benthos container, or does it also have some sidecar which acts as a metrics proxy and perhaps reports 503 when Benthos shuts down? Do you see any relevant logs coming from the Benthos container?

PS: There's some work being done in #1363 to add a shutdown_delay config field, but it hasn't been merged yet.

@madalina2311
Author

The folder contains events from NATS (one event per file). I have a test stream and I know how many messages I send; I get the same number of JSON files inside the folder. On top of that, I have delete_objects: true ("Whether to delete downloaded objects from the bucket once they are processed"), so I take this to mean it deletes each processed event, i.e. each one added to the BQ table.

I do a count of events on the BQ table and it equals the number of JSON files (events I sent to NATS). At this point the bucket is empty and I get "Readiness probe failed: HTTP probe failed with statuscode: 503", but nothing relevant appears in the Benthos logs: level=info msg="Inserting messages as objects to GCP BigQuery:development:nats:SingleCall" @service=benthos label=bq_singlecall path=root.output stream=bq_SingleCall

The rest of the pipelines keep working despite the 503 (adding logs to the bucket from NATS: nats -> gcs); when I start generating events I can see the messages appearing in the buckets. But the copy from the GCS bucket to BQ does not resume unless I restart the pod.

Also, there is no sidecar deployed.

So am I right that this pipeline shuts down once the last file from the bucket is processed (sent to BQ), and something has to restart it to resume processing (every x minutes)? It is not like a watcher that streams messages to BQ each time they appear in the bucket? Is there something I could add to make my pipeline run on a continuous basis? I know that streaming from NATS to BQ works, but the use case here is GCS to BQ, as I get multiple events inside the buckets (from other sources as well).

@mihaitodor
Collaborator

So am I right that this pipeline shuts down once the last file from the bucket is processed (sent to BQ), and something has to restart it to resume processing (every x minutes)?

Yeah, that's what I'd expect to happen, but I need to set up a GCP account to test. Try testing it locally if you can and confirm. I wonder if you'll see anything extra in the logs if you set the level to ALL.
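
That would be something like this in the config (ALL is the most verbose level the Benthos logger accepts):

```yaml
logger:
  # Log everything, including trace-level events.
  level: ALL
```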

It is not like a watcher that streams messages to BQ each time they appear in the bucket?

Nope, it only lists the files in the bucket once and then starts reading the ones it discovered. It should shut down once it's done.

Is there something I could add to make my pipeline run on a continuous basis?

I guess you could try wrapping the input in a read_until input where you'd set restart_input: true and check: "false" (see the sketch below). However, it might try to list the bucket very frequently if it happens to be empty for a while. I think that can be worked around by using a sequence input in combination with a generate input and a sleep processor, or maybe some simpler way that I can't think of right now... We could also add an extra config field to the read_until input to have it sleep a while before restarting the underlying input. I can try to experiment with it myself if you get stuck. Just let me know.
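
A minimal, untested sketch of that read_until wrapper, reusing the bucket settings from your config above (the tight-relisting caveat still applies):

```yaml
input:
  read_until:
    # Never terminate based on message contents; "false" never matches,
    # so the wrapper runs indefinitely.
    check: "false"
    # Restart the wrapped input once it has drained the bucket.
    restart_input: true
    input:
      gcp_cloud_storage:
        bucket: "benthos_storage"
        prefix: "dev/StatusUpdate/2022/08/05/16"
        codec: all-bytes
        delete_objects: true
```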

@madalina2311
Author

Hi. I am adding new notes here as I pulled the latest version and ran some tests again. The following appears after all files have been processed. Note that in my situation (because I use delete_objects: true) I end up with an empty folder, which is then automatically deleted from GCS.

```
2022/09/13 09:09:25 http: panic serving 12.0.0.86:39198: runtime error: invalid memory address or nil pointer dereference
goroutine 2492 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1850 +0xbf
panic({0x2ae91c0, 0x52f9870})
	/usr/local/go/src/runtime/panic.go:890 +0x262
github.com/benthosdev/benthos/v4/internal/impl/pure.(*readUntilInput).Connected(0xc00271b110?)
	/go/src/github.com/benthosdev/benthos/internal/impl/pure/input_read_until.go:235 +0x1c
github.com/benthosdev/benthos/v4/internal/stream.(*Type).IsReady(0xc000e66000)
	/go/src/github.com/benthosdev/benthos/internal/stream/type.go:87 +0x32
github.com/benthosdev/benthos/v4/internal/stream/manager.(*StreamStatus).IsReady(...)
	/go/src/github.com/benthosdev/benthos/internal/stream/manager/type.go:49
github.com/benthosdev/benthos/v4/internal/stream/manager.(*Type).HandleStreamReady(0xc000882870, {0x39f34b0, 0xc001730540}, 0xc000568ca0?)
	/go/src/github.com/benthosdev/benthos/internal/stream/manager/api.go:611 +0x125
github.com/benthosdev/benthos/v4/internal/api.(*Type).RegisterEndpoint.func1({0x39f34b0, 0xc001730540}, 0xc003076a80?)
	/go/src/github.com/benthosdev/benthos/internal/api/api.go:264 +0xcf
net/http.HandlerFunc.ServeHTTP(0xc00231a500?, {0x39f34b0?, 0xc001730540?}, 0x800?)
	/usr/local/go/src/net/http/server.go:2109 +0x2f
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000635c80, {0x39f34b0, 0xc001730540}, 0xc00231a300)
	/go/pkg/mod/github.com/gorilla/mux@v1.8.0/mux.go:210 +0x1cf
net/http.serverHandler.ServeHTTP({0xc003076900?}, {0x39f34b0, 0xc001730540}, 0xc00231a300)
	/usr/local/go/src/net/http/server.go:2947 +0x30c
net/http.(*conn).serve(0xc000a42c80, {0x39f4e90, 0xc0029c8450})
	/usr/local/go/src/net/http/server.go:1991 +0x607
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:3102 +0x4db
```

@Jeffail
Collaborator

Jeffail commented Sep 15, 2022

Hey @madalina2311 thanks for the trace, should be fixed with: 64eb723
