I was asked to help with an agent that can detect the National ID of a user, so I spent some time testing the options and validating how they work.
The best approach seems to be leveraging an Entity defined to match the format of the ID, and then adding a single training phrase that uses this Entity to trigger the Intent that captures the ID.
This example is for the Spanish National ID (DNI or NIE), which has the form 12345678A (DNI) or X1234567A (NIE): 8 digits followed by one letter, where for the NIE, the number issued to foreign nationals, the first digit is replaced by a letter (X, Y or Z).
The only issue is that with Speech To Text, that is, in a voice-enabled bot, the ID might be passed to the NLU engine with spaces between the characters, in any of the positions. Using Regular Expressions I ended up representing the format of the ID like this:
You can see that I’m taking into account only the valid letters both at the beginning and at the end, and the “\s?” indicates that there may or may not be a space between consecutive characters.
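A pattern along these lines can be sketched in Python. This is a minimal sketch under stated assumptions, not necessarily the exact expression used in the agent: the names ID_PATTERN and matches_id_format are hypothetical, the leading letters are assumed to be X, Y or Z, and the trailing letters are assumed to be the 23 letters of the standard DNI check-letter table.

```python
import re

# Assumption: the valid trailing letters are the 23 DNI check letters.
# Assumption: an NIE starts with X, Y or Z instead of the first digit.
ID_PATTERN = re.compile(
    r"[0-9XYZ]"                     # first character: digit (DNI) or X/Y/Z (NIE)
    r"(?:\s?[0-9]){7}"              # seven more digits, each optionally preceded by a space
    r"\s?[TRWAGMYFPDXBNJZSQVHLCKE]",  # optional space, then the check letter
    re.IGNORECASE,                  # Speech To Text may return lower-case letters
)


def matches_id_format(text: str) -> bool:
    """Return True if the whole text looks like a DNI or NIE, spaces allowed."""
    return ID_PATTERN.fullmatch(text.strip()) is not None


print(matches_id_format("12345678Z"))          # compact DNI
print(matches_id_format("1 2 3 4 5 6 7 8 Z"))  # spaced out, as voice input may arrive
print(matches_id_format("1234567Z"))           # one digit short, should not match
```

Note that this only checks the shape of the ID; whether the final letter is the right one for the digits is a separate check.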
Once this is done the Intent is pretty simple:
As you can see there is an input Context, as I don’t expect the user to say their ID out of the blue, but only when they are in a flow where the bot requests it, for example with this Intent:
We also need to respond if the ID is not properly captured, for example if characters are missing. My idea for that was to define a fallback intent with the same input Context. If the bot detects something that does not match a valid ID, as represented by the regular expression, it will trigger this intent while in the defined Context:
My bonus track for this case is to verify that the ID is valid, as the last letter acts as a checksum that we can easily verify. For that I’m using a fulfillment for the in_requestDNI_valid intent. The ID is captured in the parameter USERDNI, and the code to perform the check is this one: (credits to Lois6b for the code in this post)
The code also cleans up the received ID, removing the spaces and using upper case characters for all the letters. This is stored in a parameter for later use in the bot.
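The clean-up and the check-letter verification can be sketched as follows. This is a Python illustration under stated assumptions, not the fulfillment code credited above: the function names normalize_id and is_valid_id are hypothetical, and the sketch relies on the standard rule that the check letter is selected by the number modulo 23, with a leading X, Y or Z in an NIE counted as 0, 1 or 2.

```python
import re

# Standard DNI check-letter table, indexed by (number % 23).
CHECK_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"
# For an NIE, the leading letter is replaced by a digit before the calculation.
NIE_PREFIX = {"X": "0", "Y": "1", "Z": "2"}


def normalize_id(raw: str) -> str:
    """Remove all whitespace and upper-case the letters (hypothetical helper)."""
    return re.sub(r"\s+", "", raw).upper()


def is_valid_id(raw: str) -> bool:
    """Check the shape of the ID and that the final letter matches the digits."""
    value = normalize_id(raw)
    if not re.fullmatch(r"[0-9XYZ][0-9]{7}[A-Z]", value):
        return False
    # Build the 8-digit number, mapping an NIE prefix letter to its digit.
    digits = NIE_PREFIX.get(value[0], value[0]) + value[1:8]
    return CHECK_LETTERS[int(digits) % 23] == value[8]


print(normalize_id("1 2 3 4 5 6 7 8 z"))  # 12345678Z
print(is_valid_id("1 2 3 4 5 6 7 8 z"))   # True
print(is_valid_id("12345678A"))           # False
```

In a fulfillment, the normalized value would be the one to store in a parameter for later use, so the rest of the bot never has to deal with the spaced or lower-case form.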
I hope you find this useful.
I work for Google Cloud, but the ideas and opinions in this post are my own.