Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Lambda and API Gateway request IDs to Cloudwatch logs when there's a PHP-FPM issue #865

Closed
wants to merge 5 commits into from
Closed
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 44 additions & 7 deletions src/Event/Http/FpmHandler.php
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,11 @@ final class FpmHandler extends HttpHandler
/** @var Process|null */
private $fpm;

/** @var string */
private $lambdaRequestId;
/** @var string */
private $apiGatewayRequestId;

public function __construct(string $handler, string $configFile = self::CONFIG)
{
$this->handler = $handler;
Expand Down Expand Up @@ -79,7 +84,7 @@ public function start(): void
$this->fpm->setTimeout(null);
$this->fpm->start(function ($type, $output): void {
// Send any PHP-FPM log to CloudWatch
echo $output;
$this->logToCloudwatch($output);
aran112000 marked this conversation as resolved.
Show resolved Hide resolved
});

$this->client = new Client;
Expand Down Expand Up @@ -108,16 +113,19 @@ public function __destruct()
*/
public function handleRequest(HttpRequestEvent $event, Context $context): HttpResponse
{
$this->lambdaRequestId = $context->getAwsRequestId();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we mutating the handler state?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is exactly where I stumbled when working on this locally.

I don't really want to store state in the class, but then that means passing it everywhere.

This led me to realize that in most cases there is no context associated to the log lines, since we boot PHP-FPM before any request comes in.

In the end, I'm not 100% sure about the approach, and #895 may be able to remove the source issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh, I agree #895 or something like it, is really the best option here.

$this->apiGatewayRequestId = $event->getRequestContext()['requestId'] ?? null;

$request = $this->eventToFastCgiRequest($event, $context);

try {
$response = $this->client->sendRequest($this->connection, $request);
} catch (Throwable $e) {
printf(
"Error communicating with PHP-FPM to read the HTTP response. A root cause of this can be that the Lambda (or PHP) timed out, for example when trying to connect to a remote API or database, if this happens continuously check for those! Bref will restart PHP-FPM now. Original exception message: %s %s\n",
$this->logToCloudwatch(sprintf(
'Error communicating with PHP-FPM to read the HTTP response. A root cause of this can be that the Lambda (or PHP) timed out, for example when trying to connect to a remote API or database, if this happens continuously check for those! Bref will restart PHP-FPM now. Original exception message: %s %s',
get_class($e),
$e->getMessage()
);
));

// Restart PHP-FPM: in some cases PHP-FPM is borked, that's the only way we can recover
$this->stop();
Expand Down Expand Up @@ -221,7 +229,7 @@ private function killExistingFpm(): void

// Never seen this happen but just in case
if ($pid <= 0) {
echo "PHP-FPM's PID file contained an invalid PID, assuming PHP-FPM isn't running.\n";
$this->logToCloudwatch("PHP-FPM's PID file contained an invalid PID, assuming PHP-FPM isn't running.");
unlink(self::SOCKET);
unlink(self::PID_FILE);
return;
Expand All @@ -243,12 +251,12 @@ private function killExistingFpm(): void
return;
}

echo "PHP-FPM seems to be running already. This might be because Lambda stopped the bootstrap process but didn't leave us an opportunity to stop PHP-FPM (did Lambda timeout?). Stopping PHP-FPM now to restart from a blank slate.\n";
$this->logToCloudwatch("PHP-FPM seems to be running already. This might be because Lambda stopped the bootstrap process but didn't leave us an opportunity to stop PHP-FPM (did Lambda timeout?). Stopping PHP-FPM now to restart from a blank slate.");

// The previous PHP-FPM process is running, let's try to kill it properly
$result = posix_kill($pid, self::SIGTERM);
if ($result === false) {
echo "PHP-FPM's PID file contained a PID that doesn't exist, assuming PHP-FPM isn't running.\n";
$this->logToCloudwatch("PHP-FPM's PID file contained a PID that doesn't exist, assuming PHP-FPM isn't running.");
unlink(self::SOCKET);
unlink(self::PID_FILE);
return;
Expand Down Expand Up @@ -282,4 +290,33 @@ private function getResponseHeaders(ProvidesResponseData $response): array
{
return array_change_key_case($response->getHeaders(), CASE_LOWER);
}

/**
* Log a message to Cloudwatch, this is specific for FPM related errors and will include AWS Request IDs for the
* current Lambda invocation and API Gateway (where applicable) to aid debugging any FPM-related issues.
*/
private function logToCloudwatch(string $message): void
{
echo trim($message) . ' ' . json_encode($this->getLogContext()) . PHP_EOL;
}

/**
* Returns an array with AWS Request IDs for the current Lambda invocation and API Gateway (where applicable)
*
* @return string[]|array
*/
private function getLogContext(): array
{
$logContext = [];

if ($this->lambdaRequestId) {
$logContext['lambdaRequestId'] = $this->lambdaRequestId;
aran112000 marked this conversation as resolved.
Show resolved Hide resolved
}

if ($this->apiGatewayRequestId) {
$logContext['apiGatewayRequestId'] = $this->apiGatewayRequestId;
}

return $logContext;
}
}