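A collection of the regular expressions I have used while translating.
Processing Markdown converted by pandoc
Purpose: convert this web page into a Markdown file so that it can be translated in BasicCAT.
The content to process looks like this: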
[activation]{#0010}
-------------------
A short communication between an installed program and the
manufacturer’s website. The program sends your serial number and a few
anonymous details about your computer. The website checks that you own a
license or that you are just starting a free trial, and returns a code
to authorize the program to run on your computer.
See also: [CAL license](#0140)
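- Remove the lines that consist entirely of hyphens (they begin and end with -), such as the underline beneath the heading. Match them with the expression ^-+$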
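- Markdown separates paragraphs with two newlines, and a paragraph may span one or more lines of text. In the Markdown that pandoc produces, each paragraph is wrapped across several lines. I plan to open the Markdown file in BasicCAT as a txt file, so the line breaks inside each paragraph have to be replaced. This operation is known as reflow.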
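  - Replace each run of consecutive newlines with a marker: match (\r\n){2,} and replace it with "paragraph marker". The pattern assumes Windows (CRLF) line endings.
  - Replace every remaining newline with a space. In Markdown a line break inside a paragraph is equivalent to a space, so deleting the breaks outright would leave words run together.
  - Replace "paragraph marker" with two newlines again.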
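The three reflow steps can also be scripted. The following is a minimal Python sketch rather than the procedure used in the original workflow: the function name `reflow` and the `MARKER` string are hypothetical, and line endings are normalized to LF first, so the pattern becomes `\n{2,}` instead of `(\r\n){2,}`.

```python
import re

# Hypothetical placeholder: any string that cannot occur in the text will do.
MARKER = "\x00PARAGRAPH\x00"


def reflow(text: str) -> str:
    """Join the wrapped lines of each paragraph while keeping paragraph breaks."""
    # Normalize CRLF and bare CR to LF so a single pattern covers every case.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Step 1: protect paragraph breaks (two or more newlines) with the marker.
    text = re.sub(r"\n{2,}", MARKER, text)
    # Step 2: any newline still present is a soft wrap, so turn it into a space.
    text = text.replace("\n", " ")
    # Step 3: restore each paragraph break as exactly two newlines.
    return text.replace(MARKER, "\n\n")


if __name__ == "__main__":
    sample = "first line\nsecond line\n\nnext paragraph\nwraps here"
    print(reflow(sample))
```

Using a marker that cannot appear in the text avoids accidental collisions with the content being translated.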
- Remove the anchors, changing [activation]{#0010} into ### activation: match \[(.*?)\]\{#\d+\} and replace it with ### \1
- Remove the links, changing [CAL license](#0140) into CAL license: match \[(.*?)\]\(#\d+\) and replace it with \1
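The hyphen-line, anchor, and link substitutions above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the text is assumed to use LF line endings, and the parentheses in the link pattern are escaped so that they match the literal ( and ) around the target.

```python
import re


def remove_hyphen_lines(text: str) -> str:
    # re.MULTILINE makes ^ and $ anchor at every line, not just the whole string.
    # Only the hyphens are removed; the line break stays, leaving a blank line.
    return re.sub(r"^-+$", "", text, flags=re.MULTILINE)


def anchors_to_headings(text: str) -> str:
    # "[activation]{#0010}" becomes "### activation"
    return re.sub(r"\[(.*?)\]\{#\d+\}", r"### \1", text)


def strip_internal_links(text: str) -> str:
    # "[CAL license](#0140)" becomes "CAL license"
    return re.sub(r"\[(.*?)\]\(#\d+\)", r"\1", text)


if __name__ == "__main__":
    sample = (
        "[activation]{#0010}\n"
        "-------------------\n"
        "See also: [CAL license](#0140)"
    )
    cleaned = strip_internal_links(anchors_to_headings(remove_hyphen_lines(sample)))
    print(cleaned)
```

Because the non-greedy group (.*?) stops at the first closing bracket, several links on the same line are handled one by one.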
Result:
### activation
A short communication between an installed program and the manufacturer’s website. The program sends your serial number and a few anonymous details about your computer. The website checks that you own a license or that you are just starting a free trial, and returns a code to authorize the program to run on your computer.
See also: CAL license