Regex | The British and American Style of Spelling

Source: https://yellorn.com/programming/regex-the-british-and-american-style-of-spelling

American English and British English differ in several ways that are reflected in their spelling. One frequently observed difference is that words ending in the suffix ze in American English often end in se in British English. Given the American English spelling of a word that ends in ze, your task is to find the total count of its British and American variants across all the given sequences of words. That is, you need to count the occurrences of the word as given (the version ending in -ze) as well as the occurrences of its British English counterpart (the version ending in -se).

Input Format

The first line contains N. N lines follow; each contains a sequence of words (W) separated by a single space. The next line contains T. T test cases follow, one per line. Each test case contains the American English spelling of a word (W’).

Constraints

1 <= N <= 10
Each line contains no more than 10 words (W)
Each character of W and W’ is a lowercase letter.
If C is the number of characters in W or W’, then
1 <= C <= 20
1 <= T <= 10
W’ ends with ze (the US version of the word)

Output Format

Output T lines. On each line, output the total number of American and British versions of W’ found across all N lines of words.

Sample Input

2
yellorn has such a good ui that it takes no time to familiarise its environment
to familiarize oneself with ui of yellorn is easy
1
familiarize

Sample Output

2

Explanation

In the given 2 lines, we find familiarize and familiarise once each. So, the total count is 2.
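
The count in this explanation can be reproduced with a short sketch. It assumes the approach of stripping the trailing ze from the query and matching the remaining stem followed by either se or ze as a whole word; the names lines, stem, and sample_count are illustrative only.

```python
import re

# Sample input lines from the problem statement.
lines = [
    "yellorn has such a good ui that it takes no time to familiarise its environment",
    "to familiarize oneself with ui of yellorn is easy",
]

word = "familiarize"
stem = word[:-2]  # drop the trailing "ze", leaving "familiari"

# \b keeps the match to whole words; [sz]e accepts both spellings.
pattern = re.compile(rf"\b{re.escape(stem)}[sz]e\b")

sample_count = sum(len(pattern.findall(line)) for line in lines)
print(sample_count)  # 2
```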

Testcase
Input

3
relationship shower paralyze catalyze quality calculation tie permanent colonise venture
trend determine catalyse wet groceries large driver temperature give caramelie
chew comfort supply possess award possess precisely colonize catalyze help
9
organize
materialize
hydrolyze
catalyze
realize
colonize
paralyze
caramelize
recognize

Output

0
0
0
3
0
2
1
0
0

Solution

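import re
import sys

# Read N, then join the N lines of text into a single string.
n = int(sys.stdin.readline())
text = "".join(sys.stdin.readline() for _ in range(n))

# For each query, match its stem followed by "se" or "ze" as a whole word.
for _ in range(int(sys.stdin.readline())):
    word = sys.stdin.readline().strip()
    pattern = rf"\b{re.escape(word[:-2])}[sz]e\b"
    print(len(re.findall(pattern, text)))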
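
Because the words are separated by single spaces and contain only lowercase letters, the same counts can probably be obtained without a regex by tallying whole words once and looking up both spellings. The following is a minimal sketch of that alternative, not the solution above; the function name count_variants and its parameters are hypothetical.

```python
from collections import Counter


def count_variants(lines, queries):
    """For each query ending in "ze", count whole-word occurrences of the
    query and of its "-se" counterpart across all lines."""
    # split() yields whole words, so no word-boundary handling is needed.
    freq = Counter(w for line in lines for w in line.split())
    results = []
    for query in queries:
        british = query[:-2] + "se"  # e.g. "catalyze" -> "catalyse"
        # Counter returns 0 for words that never appear.
        results.append(freq[query] + freq[british])
    return results


# Lines from the test case above.
testcase_lines = [
    "relationship shower paralyze catalyze quality calculation tie permanent colonise venture",
    "trend determine catalyse wet groceries large driver temperature give caramelie",
    "chew comfort supply possess award possess precisely colonize catalyze help",
]
print(count_variants(testcase_lines, ["catalyze", "colonize", "paralyze", "caramelize"]))
# [3, 2, 1, 0]
```

Counting once up front means each query is a pair of dictionary lookups rather than a fresh scan of the text, which matters little at these input sizes but keeps the logic simple.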
