Problem
Given a wordlist
, we want to implement a spellchecker that converts a query word into a correct word.
For a given query
word, the spell checker handles two categories of spelling mistakes:
Capitalization: If the query matches a word in the wordlist (
case-insensitive
), then the query word is returned with the same case as the case in the wordlist.
- Example:
wordlist = ["yellow"]
,query = "YellOw"
:correct = "yellow"
- Example:
wordlist = ["Yellow"]
,query = "yellow"
:correct = "Yellow"
- Example:
wordlist = ["yellow"]
,query = "yellow"
:correct = "yellow"
- Example:
Vowel Errors: If after replacing the vowels
1 | ('a', 'e', 'i', 'o', 'u') |
of the query word with any vowel individually, it matches a word in the wordlist (
case-insensitive
), then the query word is returned with the same case as the match in the wordlist.
- Example:
wordlist = ["YellOw"]
,query = "yollow"
:correct = "YellOw"
- Example:
wordlist = ["YellOw"]
,query = "yeellow"
:correct = ""
(no match) - Example:
wordlist = ["YellOw"]
,query = "yllw"
:correct = ""
(no match)
In addition, the spell checker operates under the following precedence rules:
- When the query exactly matches a word in the wordlist (case-sensitive), you should return the same word back.
- When the query matches a word up to capitlization, you should return the first such match in the wordlist.
- When the query matches a word up to vowel errors, you should return the first such match in the wordlist.
- If the query has no matches in the wordlist, you should return the empty string.
Given some queries
, return a list of words answer
, where answer[i]
is the correct word for query = queries[i]
.
Example 1:
1 | Input: wordlist = ["KiTe","kite","hare","Hare"], queries = ["kite","Kite","KiTe","Hare","HARE","Hear","hear","keti","keet","keto"] |
Example 2:
1 | Input: wordlist = ["yellow"], queries = ["YellOw"] |
Constraints:
1 <= wordlist.length, queries.length <= 5000
1 <= wordlist[i].length, queries[i].length <= 7
wordlist[i]
andqueries[i]
consist only of only English letters.
Analysis
题目要求实现一个元音的拼写检查器,检查器可以允许两类错误:
- 大小写错误,即大小写不影响两个单词之间的匹配;
- 元音错误,即一个单词中元音字符可以相互匹配。
所以上面这两种情况都是可以成功匹配的,而且题目也给出了匹配的优先级,从高到低按照以下顺序排列:
- 完全匹配,即两个字符串要完全一样;
- 大小写匹配,即忽略大小写的不同,按照字母匹配;
- 元音匹配,即忽略元音字符的不同;
我们先来看每一种类型的匹配怎么实现,然后再看这个优先级怎么做。第一种非常简单,我们用一个set或者map存储wordlist
,然后对queries
中每一个单词寻找一下是否存在即可。
第二种是大小写的匹配,因为是忽略大小写,所以直接统一处理成小写即可,所以可以先把wordlist
中的单词都转换为小写存储到map中,然后对queries
中每一个单词也转换为小写,然后再看看是否存在map中。
第三个是元音的匹配,如果参照第二种匹配的思路,应该就是把元音位置上所有的可能性都找出来,然后存到map中,最后在匹配的时候也是把每一种情况都列举出来。这样做是可行的,但是复杂度非常高。比如google
这个单词,有三个元音字符,每个元音位置都有5种情况,所以一共会产生75个单词。这个数量太大了,我们要想办法优化。
有没有发现,按照上面这种做法,在预处理wordlist
的时候我们要全部找一遍,然后在匹配queries
中的单词的时候,我们还要全部找一遍,但是我们并不关心它最后能匹配到什么单词,所以元音位置上最终替换成什么我们是不关心的,那这里就参考了正则表达式的思想,用一个特殊字符代替元音就好。假设这里我用*
来代替元音字符,那么上面例子的google
就会成为g**gl*
,这个就是存储的key,value是google
。然后有一个待匹配的单词gaegle
,我们也采取同样的处理方式,处理为g**gl*
,这样就会发现这两个能匹配上了!
三种匹配我们都找到方法了,然后我们就要看优先级的问题。实际上因为这里优先级并不复杂,而且三种方法是各自独立的,为了解题的需要,每种匹配方法都各自使用独立的存储空间即可,在判断逻辑的时候有分先后就可以了。
Solution
精确匹配因为key和value都是一样的,所以就用了set,后面两个就用map。
Code
1 | class Solution { |
Summary
这道题是字符串类题目中比较负责和综合的题目,实际上拆解开每个部分都不难,但是综合起来考虑还是需要对字符串比较熟悉才行。这道题目还有一个值得注意的点是,利用了正则的思想去做模糊匹配。这道题目的分享到这里,感谢你的支持!