Hi there,
The requirements look clear and straightforward to implement. The MP3 file will first be uploaded with a POST request. I haven't worked with AWS, but Google Speech requires the audio to be in a specific encoding and sample rate, so once the upload is complete we need to convert the MP3 file to a suitable format. Also, Google processes audio synchronously only if it is under 1 minute long; anything longer is processed asynchronously, which means it is handled in a queue-like manner.
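To give an idea of the conversion step, here is a rough sketch that builds an ffmpeg command for turning the uploaded MP3 into mono FLAC. The 16 kHz sample rate, the FLAC target, the use of ffmpeg itself, and the function and file names are my assumptions for illustration; the final settings would depend on what the speech service expects.

```python
import shlex


def build_ffmpeg_command(src_path, dst_path, sample_rate_hz=16000):
    """Return the ffmpeg argument list that converts src_path to mono FLAC.

    The 16 kHz default and the FLAC codec are assumptions, not requirements
    taken from the project description.
    """
    return [
        "ffmpeg",
        "-y",                        # overwrite dst_path if it already exists
        "-i", src_path,              # input file (the uploaded MP3)
        "-ac", "1",                  # downmix to a single audio channel
        "-ar", str(sample_rate_hz),  # resample to the target rate
        "-c:a", "flac",              # encode the audio stream as FLAC
        dst_path,
    ]


if __name__ == "__main__":
    cmd = build_ffmpeg_command("upload.mp3", "upload.flac")
    # Print the command for inspection; running it needs ffmpeg installed,
    # e.g. subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems if an uploaded file name contains spaces.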
In light of the above, and since you're looking for keywords in the transcription, we can split the MP3 file into 30-second pieces, convert each piece accordingly, and send it to Google for processing, getting the result back right away. Finally, we check the resulting transcript for keywords.
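The split-and-check flow described above could look roughly like the sketch below. It only covers the chunk boundaries and the keyword matching; `transcribe_chunk` is a hypothetical callable standing in for the actual cut, convert, and speech API call, and all function names here are made up for illustration.

```python
import re


def chunk_ranges(duration_ms, chunk_ms=30_000):
    """Yield (start_ms, end_ms) pairs that cover the whole recording."""
    for start in range(0, duration_ms, chunk_ms):
        yield start, min(start + chunk_ms, duration_ms)


def find_keywords(transcript, keywords):
    """Return the keywords found in transcript as whole words, ignoring case."""
    found = []
    for kw in keywords:
        pattern = r"\b" + re.escape(kw) + r"\b"
        if re.search(pattern, transcript, flags=re.IGNORECASE):
            found.append(kw)
    return found


def scan_recording(duration_ms, keywords, transcribe_chunk):
    """Transcribe each chunk and map (start_ms, end_ms) to the keywords it contains.

    transcribe_chunk(start_ms, end_ms) -> str is a hypothetical placeholder
    for the real speech-to-text request.
    """
    hits = {}
    for start, end in chunk_ranges(duration_ms):
        text = transcribe_chunk(start, end)
        matched = find_keywords(text, keywords)
        if matched:
            hits[(start, end)] = matched
    return hits


if __name__ == "__main__":
    # Stand-in transcriber so the sketch runs without any external service.
    fake_transcripts = {0: "hello and welcome", 30_000: "I would like a refund"}
    result = scan_recording(
        45_000, ["refund", "invoice"], lambda start, end: fake_transcripts[start]
    )
    print(result)
```

One thing to keep in mind with fixed 30-second cuts is that a word sitting exactly on a boundary can be split between two pieces, so a small overlap between neighbouring chunks may be worth adding.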
I am planning to implement this in Python, along with a couple of packages to convert/encode and split the file. I expect this to be ready in 3 days at most. Thanks.