I've got a folder of PDF files with names like:

page_1_excercise_2.pdf
page_23_excercise_3_4_and_5.pdf
page_456_excercise_16_and_17.pdf

The numbers might contain 1, 2 or 3 digits. I need to generate HTML-code like:

<a href="pdfs/page_1_excercise_2.pdf">Page 1, Excercise 2</a>
<a href="pdfs/page_23_excercise_3_4_and_5.pdf">Page 23, Excercise 3, 4 and 5</a>
<a href="pdfs/page_456_excercise_16_and_17.pdf">Page 456, Excercise 16 and 17</a>

Is there a way to do so using a batch-file in Windows?

Here is what I was able to find so far:

@echo off
Setlocal EnableDelayedExpansion

for /f "tokens=*" %%i in ('dir /b /a:-d *.pdf') do (
set filename=%%i
set filename=!filename:.pdf=!
set filename=!filename:_= !
echo ^<a href="pdfs/%%i"^>!filename!^</a^> >> files.html)

It works fine, but I'd like to add a comma after each exercise number and capitalize the first letter (like the "P" in "Page 456...").

georgmierau
  • You need to look into regex replace - see https://stackoverflow.com/questions/14856009/win-batch-regexp-search-and-replace and https://superuser.com/questions/339118/regex-replace-from-command-line – AutoBaker Jun 28 '22 at 09:13
  • Beware, a batch file is a bit cruddy for what you're trying to do. If I were you I'd use JavaScript (I have node.js installed, which can run JS locally). It's far more versatile and you'll be able to do a lot more once you get the hang of it. https://nodejs.org/en/knowledge/file-system/how-to-search-files-and-directories-in-nodejs/ – AutoBaker Jun 28 '22 at 09:18
  • `"tokens=* delims=_."`. Then you can check whether each separate variable is numeric with `SET /A tmp=%x%+0` - for a non-numeric variable you'll see zero in `%tmp%`. – Akina Jun 28 '22 at 10:38

1 Answer

@echo off && setlocal enableextensions enabledelayedexpansion

cd /d "%~dp0" && for %%i in ("%cd%")do set "_dir=%%~nxi"
set "_alf=A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z"

>.\Output.html (
    findstr /be ^<.*^> "%~f0"
    for /f tokens^=1-7*delims^=_ %%i in ('where .:*.pdf')do (
         call %:^) "%%~ni" "%%~k" _str_1 _str_2 "%_alf%" && if "%%~xl" == ".pdf" (
         echo;^<a href="%_dir%/%%~ni_%%~j_%%~k_%%~nxl"^>!_str_1! %%~j, !_str_2! %%~nl^</a^>
        )else if "%%~xn" == ".pdf" (
         echo;^<a href="%_dir%/%%~ni_%%~j_%%~k_%%~l_%%~m_%%~nxn"^>!_str_1! %%~j, !_str_2! %%~l %%~m %%~nn^</a^>
        )else if "%%~xo" == ".pdf" (
         echo;^<a href="%_dir%/%%~ni_%%~j_%%~k_%%~l_%%~m_%%~n_%%~nxo"^>!_str_1! %%~j, !_str_2! %%~l, %%~m %%~n %%~no^</a^>
        )
    ) 
    echo\^</body^>&echo\^</html^>) & endlocal & goto :eof

%:^)
set "_str#1=%~1" && set "_str#2=%~2" && for %%i in (%~5)do (
     if /i "!_str#1:~0,1!" == "%%~i" set "%~3=%%~i!_str#1:~1!"
     if /i "!_str#2:~0,1!" == "%%~i" set "%~4=%%~i!_str#2:~1!"
    )

exit /b

<!doctype html>
<html>
<head>
<title>Our Funky HTML Page</title>
<meta name="description" content="Our first page">
<meta name="keywords" content="html tutorial template">
</head>
<body>

  • HTML file output:
<!doctype html>
<html>
<head>
<title>Our Funky HTML Page</title>
<meta name="description" content="Our first page">
<meta name="keywords" content="html tutorial template">
</head>
<body>
<a href="Q1728787/page_1_excercise_2.pdf">Page 1, Excercise 2</a>
<a href="Q1728787/page_23_excercise_3_4_and_5.pdf">Page 23, Excercise 3, 4 and 5</a>
<a href="Q1728787/page_456_excercise_16_and_17.pdf">Page 456, Excercise 16 and 17</a>
</body>
</html>

1. Change to the folder that holds the PDF files, or run the bat file from that same folder:

cd /d "%~dp0" 
:: or 
cd /d "D:\Full\Path\To\Your\PDFs\Folder"

2. Get just the current folder's name and save it in a variable:

for %%i in ("%cd%")do set "_dir=%%~nxi"

3. Define a variable holding the capital letters, separated by commas, so a for loop can iterate over them:

set "_alf=A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z"

4. Redirect the output of the whole string-processing block to the HTML file:

>.\Output.html (
        ...
        ...
       )

5. Put the necessary HTML header lines after the last line of your bat file, each one starting with < and ending with >, and use findstr /b /e (match at the beginning and at the end of the line) with the pattern <.*> to copy them into your HTML file:

...

findstr /be ^<.*^> "%~f0"

...

<!doctype html>
<html>
<head>
<title>Our Funky HTML Page</title>
<meta name="description" content="Our first page">
<meta name="keywords" content="html tutorial template">
</head>
<body>

6. Use a for /f loop over a command that lists only the PDF files in your folder (where .:*.pdf), with _ as the delimiter and tokens 1 to 8 (1-7*, where * puts the rest of the line into the eighth token):

  for /f tokens^=1-7*delims^=_ %%i in ('where .:*.pdf')do ...

7. Use an if / else if chain to check which token carries the file extension (.pdf). That tells you how many parts the name has, and so whether the output string is composed with or without the extra comma:

....)do ...  && if "%%~xl" == ".pdf" (
          echo  ...
        )else if "%%~xn" == ".pdf" (
          echo  ...
        )else if "%%~xo" == ".pdf" (
          echo  ...
        )
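
To see how steps 6 and 7 interact, here is a minimal sketch that splits the three sample names from the question and reports which token ends in .pdf. It assumes bare file names rather than the full paths that where prints (the answer strips the path with %%~ni), and the :show label is a hypothetical helper, not part of the answer's script:

```bat
@echo off
setlocal
rem Sample names copied from the question; the files do not need to exist.
for %%f in (
    "page_1_excercise_2.pdf"
    "page_23_excercise_3_4_and_5.pdf"
    "page_456_excercise_16_and_17.pdf"
) do call :show "%%~f"
endlocal
goto :eof

:show
rem Split on "_" into tokens %%i to %%o; the * sends any remainder to %%p.
for /f "tokens=1-7* delims=_" %%i in ("%~1") do (
    rem The ~x modifier yields the extension of a token, or nothing if it has none.
    if "%%~xl"==".pdf" (
        echo %~1 has its extension in token 4 [%%l]
    ) else if "%%~xn"==".pdf" (
        echo %~1 has its extension in token 6 [%%n]
    ) else if "%%~xo"==".pdf" (
        echo %~1 has its extension in token 7 [%%o]
    )
)
goto :eof
```

With these names, the first file should hit the token 4 branch (2.pdf), the second the token 7 branch (5.pdf), and the third the token 6 branch (17.pdf). Because the split is purely on _, an underscore anywhere in the folder path would shift the tokens when full paths are fed to the loop.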

8. For every case where the extension equals .pdf, the strings first go through a function that capitalizes their first character. The call passes the page and excercise tokens, plus the names of the variables (_str_1 and _str_2) that will hold the results used to compose your desired output:

...)do call %:^) "%%~ni" "%%~k" _str_1 _str_2 "%_alf%" && if ... (
         echo;^<a href="%_dir%/%%~ni_%%~j_%%~k_%%~nxl"^>!_str_1! %%~j, !_str_2! %%~nl^</a^>
        )else if ... (
         echo;^<a href="%_dir%/%%~ni_%%~j_%%~k_%%~l_%%~m_%%~nxn"^>!_str_1! %%~j, !_str_2! %%~l %%~m %%~nn^</a^>
        )else if ... (
         echo;^<a href="%_dir%/%%~ni_%%~j_%%~k_%%~l_%%~m_%%~n_%%~nxo"^>!_str_1! %%~j, !_str_2! %%~l, %%~m %%~n %%~no^</a^>
        )

9. The function receives the page and excercise strings along with the capital-letter variable _alf. It loops over the letters, and when the case-insensitive comparison (if /i) matches the first character of a string, it rebuilds that string from the capital letter plus the rest of the substring:


...)do call %:^) "%%~ni" "%%~k" _str_1 _str_2 "%_alf%" ... (
     ....
     )

%:^)
set "_str#1=%~1" && set "_str#2=%~2" && for %%i in (%~5)do (
     if /i "!_str#1:~0,1!" == "%%~i" set "%~3=%%~i!_str#1:~1!"
     if /i "!_str#2:~0,1!" == "%%~i" set "%~4=%%~i!_str#2:~1!"
    )
...
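
The capitalization idea from step 9 can also be tried on its own. This is a minimal sketch on a single hard-coded word, assuming delayed expansion is enabled; the variable name word is hypothetical and stands in for the tokens the answer passes to its function:

```bat
@echo off
setlocal enabledelayedexpansion
rem Hypothetical sample word; the answer passes the page and excercise tokens instead.
set "word=page"
for %%c in (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z) do (
    rem if /i compares case-insensitively, so a lowercase first letter matches its capital.
    if /i "!word:~0,1!"=="%%c" set "word=%%c!word:~1!"
)
echo !word!
endlocal
```

For the word page this should print Page. The loop keeps running after the match, but no later letter equals the new first character, so the result is left alone.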


Io-oI