This article introduces how to convert invoice PDFs to text with PAD, along with the pitfalls of running JavaScript in PAD.
Invoices arrive every month, so processing them every month is tough, isn't it? In my own case, to put it simply, I have to tally from the AWS invoices how much each account (each business division) used. However, those invoice PDFs run to dozens of pages per file, and there are dozens of files, so until it was automated it was quite hard.
For example, if you convert an AWS invoice PDF to text with PAD and then apply a little processing with regular expressions, you can output a TSV file like the following in one shot.
AWS account                          Usage fee        Consumption tax
hogehoge-AmiVoice00 (012345678900)   $1,234,567.89    $123,456.78
hogehoge-AmiVoice01 (012345678901)   $123,456.78      $12,345.67
...                                  ...              ...
hogehoge-AmiVoice99 (012345678999)   $0.12            $0.01
You need to create a regular expression for each type of invoice, but once you have made it, the monthly hassle is gone!
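As a rough illustration of the regular-expression step, here is a minimal sketch in JavaScript. It assumes (hypothetically) that the extracted text lists each account as a name with a 12-digit ID in parentheses, followed by two dollar amounts, as in the sample output above. Real invoice text will differ, so the pattern ACCOUNT_PATTERN, the function name invoiceTextToTsv, and the sample text are illustrative only. The sketch is written for Node.js, not for PAD's script action.

```javascript
// Hypothetical pattern: "<name> (<12-digit account ID>)" followed by two dollar amounts.
// The real layout of the extracted invoice text is an assumption here.
const ACCOUNT_PATTERN = /(\S+ \(\d{12}\))\s+(\$[\d,]+\.\d{2})\s+(\$[\d,]+\.\d{2})/g;

// Convert extracted invoice text into TSV lines (account, usage fee, consumption tax).
function invoiceTextToTsv(text) {
  const rows = ['AWS account\tUsage fee\tConsumption tax'];
  for (const m of text.matchAll(ACCOUNT_PATTERN)) {
    rows.push([m[1], m[2], m[3]].join('\t'));
  }
  return rows.join('\n');
}

// Made-up sample text standing in for the text extracted from a PDF.
const sampleText = [
  'hogehoge-AmiVoice00 (012345678900)',
  '$1,234,567.89',
  '$123,456.78',
  'hogehoge-AmiVoice01 (012345678901)',
  '$123,456.78',
  '$12,345.67',
].join('\n');

console.log(invoiceTextToTsv(sampleText));
```

The header row plus one tab-separated row per matched account is printed. Adjusting ACCOUNT_PATTERN is the part that has to be redone for each invoice type.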
What is PAD? PAD is short for Power Automate Desktop, an RPA (Robotic Process Automation) tool. Microsoft has offered it free of charge since June 2021. With PAD, even people with no programming experience can easily build automation apps through no-code/low-code development. PAD is available for download.
What you will be able to do after following along
・Convert PDFs to text (*handwritten PDFs are outside the scope of this article)
・Run JavaScript in PAD (pitfalls included)
・Automate routine work (leaving more time to devote to other important work)
What you need to follow along
・An environment where PAD can run (version at the time of writing: 2.23.114.22217)
・The PDFs you want to convert to text
1. Explanation of the overall flow
The completed overall flow looks like this
First, let me explain the rough flow from the start of processing to the end.
Overall flow of converting PDF invoices to text
Action ①: Get files in a folder
⇒ Searches the specified folder for PDFs.
Action ②: For each (loop)
⇒ Repeats the actions inside the loop for each PDF.
Action ③: Extract text from PDF
⇒ Converts the PDF to text.
Action ④: Replace text
⇒ Deletes characters that cause problems when running JavaScript.
Action ⑤: Escape text for regular expression
⇒ Replaces special characters that cause problems when running JavaScript.
Action ⑥: Execute JavaScript
⇒ Runs the JavaScript. (This is where the regular expressions are written.)
Action ⑦: Write text to a file
⇒ Saves the JavaScript execution result to a file.
Action ⑧: Move files
⇒ Moves the converted PDF to the specified folder.
Action ⑨: End (Loop)
⇒ If there are other PDFs, returns to ②. If not, the loop ends.
To summarize briefly: every PDF in the specified folder is converted to text (the JavaScript execution result is saved), and each processed PDF is moved to another folder.
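For readers who think in code, the flow above can be sketched roughly as follows. This is only an analogy in Node.js, not what PAD runs internally: processFolder, extractText, and transform are hypothetical names, and PDF text extraction is passed in as a function because Node.js has no built-in equivalent of PAD's action.

```javascript
const fs = require('fs');
const path = require('path');

// Mirrors the actions of the flow: list PDFs, loop, extract, transform, append, move.
// extractText and transform are caller-supplied stand-ins (hypothetical).
function processFolder(srcDir, outFile, doneDir, extractText, transform) {
  // Get files in a folder, keeping only names that end in .pdf
  const files = fs.readdirSync(srcDir)
    .filter((name) => name.toLowerCase().endsWith('.pdf'))
    .map((name) => path.join(srcDir, name));

  fs.mkdirSync(doneDir, { recursive: true });

  // For each PDF path (the loop ends when the list is exhausted)
  for (const currentItem of files) {
    const extracted = extractText(currentItem);   // extract text from the PDF
    const output = transform(extracted);          // clean up the text and run the script
    fs.appendFileSync(outFile, output + '\n');    // append the result to the output file
    // Move the processed PDF to the "done" folder
    fs.renameSync(currentItem, path.join(doneDir, path.basename(currentItem)));
  }
  return files.length; // number of PDFs processed
}
```

Appending rather than overwriting matters here because the output file is written once per loop iteration.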
2. Explanation of each action and tricky points
Action ①: Get files in a folder
First, let's get the list of PDFs you want to convert to text.
■Action "Folder" ⇒ "Get files in a folder"
Folder: Get files in a folder
"Folder":
Specify the path (location) of the folder that contains the PDFs you want to convert to text. In the example, a folder for AWS invoices is specified.
"File filter":
Entering *.pdf in this field returns a list of paths for every file with the .pdf extension.
"Flow variable Files":
This variable receives the list of file paths matched by the search.
You now have the list of paths of the PDFs you want to convert to text.
Action ②: For each (loop)
Next, in case there are several PDFs to convert, let's set up a loop that repeats the processing for each PDF.
■Action "Loop" ⇒ "For each"
Loop: For each
"Value to iterate":
Specify the list of PDF paths. Enter the flow variable %Files% in this field. When you specify a variable inside a PAD action, you need to enter the variable name enclosed in % on both sides.
"Flow variable CurrentItem":
This variable receives the path of one PDF. Each time the processing repeats, it is replaced with the path of the next PDF.
You are now ready to process each PDF.
Action ③: Extract text from PDF
Now, on to the main topic: let's convert the PDF to text.
■Action "PDF" ⇒ "Extract text from PDF"
PDF: Extract text from PDF
"PDF file":
Specify the path of the PDF. Enter the flow variable %CurrentItem% in this field.
"Flow variable ExtractedPDFText":
This variable receives the text extracted from the PDF. In fact, this alone completes the conversion of the PDF to text.
If you simply want to convert the invoice PDF to text, you can skip ahead to [Action ⑦: Write text to a file]. After that, just enter the flow variable %ExtractedPDFText% in "Text to write" and decide where to save it.
If you want to shape the extracted text into CSV or TSV format, it is convenient to use regular expressions in a script (JavaScript this time).
Action ④: Replace text
When you run a script such as JavaScript in PAD, there are a few pitfalls, and avoiding them takes a little ingenuity. This action replaces characters that cause problems when running JavaScript. In this example, it deletes the " characters contained in the flow variable %ExtractedPDFText%.
■Action "Text" ⇒ "Replace text"
Text: Replace text
"Text to parse":
Specify the variable that contains the text. Enter the flow variable %ExtractedPDFText% in this field.
"Text to find":
Specify the text to be replaced. Enter " in this field (one double quote). If the text contains both " and ', you cannot pass the variable to the script unless you delete one of them. The reason is explained in [Action ⑥: Execute JavaScript]. This time I chose to delete ".
"Replace with":
Specify the replacement text. Enter %''% in this field (two single quotes). Use this when you want to delete the text, that is, replace it with nothing. If you leave this field empty, the following error occurs: (Parameter 'Replacement text': cannot be empty.)
"Flow variable Replaced":
This variable receives the text after the deletion is complete.
The " characters in the text have now been deleted.
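In JavaScript terms, this action does roughly what the following small function does. The sketch is for Node.js, the function name removeDoubleQuotes is hypothetical, and the sample string is made up. It only illustrates the effect of deleting every " from the text while leaving ' in place.

```javascript
// Delete every double quote, leaving single quotes untouched
// (the same effect as the "Replace text" settings described above).
function removeDoubleQuotes(text) {
  return text.split('"').join('');
}

// Made-up sample standing in for the extracted PDF text.
const extracted = 'Plan "Business" for O\'Reilly';
console.log(removeDoubleQuotes(extracted)); // Plan Business for O'Reilly
```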
Action ⑤: Escape text for regular expression
There are still more pitfalls, so another twist is required. This action replaces special characters that cause problems when running JavaScript.
■Action "Text" ⇒ "Escape text for regular expression"
Text: Escape text for regular expression
"Text to escape":
Specify the text to escape. Enter the flow variable %Replaced% in this field. If the text still contains raw line breaks, you cannot pass the variable to the script. Here, each line-break character is replaced (escaped) with the two-character sequence \n. Note that [Action ④: Replace text] cannot replace line breaks with \n.
"Flow variable EscapedText":
This variable receives the text after the replacement is complete.
The line breaks in the text have now been replaced with \n.
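The effect on line breaks can be pictured with the following minimal sketch for Node.js. The function name escapeLineBreaks is hypothetical, and the PAD action may escape other characters as well. This sketch covers only the line-break case described here.

```javascript
// Replace raw line breaks (CRLF, CR, or LF) with the two characters backslash + n.
function escapeLineBreaks(text) {
  return text.replace(/\r\n|\r|\n/g, '\\n');
}

// Made-up sample: two lines separated by a real line break.
const replaced = 'line 1\nline 2';
const escaped = escapeLineBreaks(replaced);
console.log(escaped);        // line 1\nline 2  (printed on a single line)
console.log(escaped.length); // 14 (the 13 original characters, with the line break now 2 characters)
```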
Action ⑥: Execute JavaScript
Now that the preprocessing for running a script such as JavaScript is finally complete, let's write the JavaScript, paying attention to a few points.
■Action "Script" ⇒ "Execute JavaScript"
Script: Execute JavaScript
"JavaScript to run":
Specify the JavaScript source code by entering it in this field. Write the regular expressions in the source code to suit the format of your invoices.
When you pass a numeric variable defined in PAD to JavaScript, you can pass it as %hogehoge%. When you pass a text variable, however, you need to pass the variable name enclosed in " or ' on both sides: "%hogehoge%" or '%hogehoge%'. If you do not write it this way, the following error occurs: (Microsoft JScript compilation error: Invalid character.) If the text does not contain ", you can pass it as "%hogehoge%"; if it does not contain ', you can pass it as '%hogehoge%'.
This is why, as handled in [Action ④: Replace text], an error occurs when the text contains both " and ' unless one of them is deleted. The variable name must be enclosed in " or ' on both sides, so if the text contains both, whichever quote you choose is closed early by the same character inside the text. The result is an error such as: (Microsoft JScript compilation error: Expected ';'.)
When declaring variables in the JavaScript, var, let, and const could not be used. Writing them produced an error such as: (Microsoft JScript compilation error: Expected ';'.)
To output the result of the JavaScript, you can use WScript.Echo() and WScript.StdOut.Write(). WScript.Echo() outputs with a trailing line break, and WScript.StdOut.Write() outputs without one.
"Flow variable JavascriptOutput":
This variable receives the JavaScript execution result (on success).
"Flow variable ScriptError":
This variable receives the JavaScript error details (on failure).
You can now run JavaScript in PAD.
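The quoting rules in this section are easier to see with a small simulation. The sketch below assumes that PAD pastes the variable's text into the script source before the script is compiled, which is consistent with the errors described here but is an assumption, not documented behavior. substitute and compiles are hypothetical helpers, and the check runs under Node.js rather than JScript, so the exact error messages differ.

```javascript
// Assumption: PAD replaces %Name% in the script with the variable's raw text
// before compiling. This helper imitates that textual substitution.
function substitute(script, name, value) {
  return script.split('%' + name + '%').join(value);
}

// Returns true if the resulting source is syntactically valid JavaScript.
function compiles(source) {
  try {
    new Function(source); // compile only; the body is never run
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// A one-line script that receives a text variable enclosed in double quotes.
const script = 'text = "%EscapedText%";';

console.log(compiles(substitute(script, 'EscapedText', 'a\\nb')));    // true: line break escaped as \n
console.log(compiles(substitute(script, 'EscapedText', 'a\nb')));     // false: raw line break inside the literal
console.log(compiles(substitute(script, 'EscapedText', 'say "hi"'))); // false: " ends the literal early
console.log(compiles(substitute(script, 'EscapedText', "it's")));     // true: ' is harmless inside "..."
```

Under this assumption, deleting one kind of quote and escaping line breaks beforehand is what keeps the pasted text inside a single valid string literal.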
Action ⑦: Write text to a file
Next, let's save the JavaScript execution result to a file.
■Action "Text" ⇒ "Write text to a file"
Text: Write text to a file
"File Path":
Specify the destination path (including the file name). In the example, a .tsv file for the AWS invoices is specified. *Both TSV and CSV formats are easy to produce with regular expressions.
"Text to write":
Specify the text you want to save. Enter the flow variable %JavascriptOutput% in this field. *If you are not using JavaScript (regular expressions), enter the flow variable %ExtractedPDFText% here instead.
"Append new line":
Specify whether to add a line break after the text is written. Turn it on or off as you like.
"If file exists":
Specify whether to overwrite the existing content or append to the end. This time the file is written inside a loop, so select the option to append content.
The text you want to save is now saved to the file.
Action ⑧: Move files
Use this action when you want to move the converted PDFs to another folder to manage them.
■Action "File" ⇒ "Move files"
File: Move files
"Files to move":
Specify the path of the PDF. Enter the flow variable %CurrentItem% in this field.
"Destination Folder":
Specify the path (location) of the destination folder. In the example, a folder for processed AWS invoices is specified.
"If file exists":
Specify whether to overwrite the existing file or do nothing. Basically, select overwrite (as you like).
The converted PDFs are now moved to another folder.
Action ⑨: End (Loop)
This action is added automatically when you set up "Action ②: For each". If there are still PDFs that have not been converted to text, the loop returns to ②. Once all PDFs have been processed, the loop (the whole flow) ends.
Final thoughts
How was it? Were you able to convert your PDFs to text? The first-time setup may have been a little difficult, but if you customize it to suit your business flow, I think monthly invoice processing can be done with one click.
That minion
I am constantly experimenting in the areas of business automation, RPA, and MA so that most work can be done with just a click.