python | 无声独白

China Broadcasting Network (广电) has started offering mobile service, so the number-location table needs yet another update; falling behind could cause incidents. For now I can't scrape the data myself, so I have to rely on a database published by someone more capable. After looking around I found lovedboy's GitHub repository and its accompanying script. The intended usage is data file + script, which gives you a lookup you can call on the fly, but we store this data in MySQL, so I had no choice but to export it. I modified the original lookup script and the test failed. 😅 So I gave up, fed the original script and my requirements to ChatGPT, and got the script below. A trial run went smoothly. 😲 Sharing it here.

    # -*- coding: utf-8 -*-
    import os
    import struct
    import sys

    __author__ = 'lovedboy'

    # Record text is NUL-terminated; Python 3 reads bytes, Python 2 reads str.
    if sys.version_info > (3, 0):
        def get_record_content(buf, start_offset):
            end_offset = buf.find(b'\x00', start_offset)
            return buf[start_offset:end_offset].decode()
    else:
        def get_record_content(buf, start_offset):
            end_offset = buf.find('\x00', start_offset)
            return buf[start_offset:end_offset]


    class Phone(object):
        def __init__(self, dat_file=None):
            if dat_file is None:
                dat_file = os.path.join(os.path.dirname(__file__), "phone.dat")
            with open(dat_file, 'rb') as f:
                self.buf = f.read()

            self.head_fmt = "<4si"    # version string, offset of first index record
            self.phone_fmt = "<iiB"   # number prefix, record offset, carrier type
            self.head_fmt_length = struct.calcsize(self.head_fmt)
            self.phone_fmt_length = struct.calcsize(self.phone_fmt)
            self.version, self.first_phone_record_offset = struct.unpack(
                self.head_fmt, self.buf[:self.head_fmt_length])
            self.phone_record_count = (
                len(self.buf) - self.first_phone_record_offset) // self.phone_fmt_length

        def get_phone_dat_msg(self):
            print("Version: {}".format(self.version))
            print("Total records: {}".format(self.phone_record_count))

        @staticmethod
        def get_phone_no_type(no):
            # Carrier names are kept in Chinese because they are the stored values.
            # Any other code falls through and returns None.
            if no == 4:
                return "电信虚拟运营商"   # China Telecom MVNO
            if no == 5:
                return "联通虚拟运营商"   # China Unicom MVNO
            if no == 6:
                return "移动虚拟运营商"   # China Mobile MVNO
            if no == 3:
                return "电信"             # China Telecom
            if no == 2:
                return "联通"             # China Unicom
            if no == 1:
                return "移动"             # China Mobile

        @staticmethod
        def _format_phone_content(phone_num, record_content, phone_type):
            province, city, zip_code, area_code = record_content.split('|')
            return {
                "phone": phone_num,
                "province": province,
                "city": city,
                "zip_code": zip_code,
                "area_code": area_code,
                "phone_type": Phone.get_phone_no_type(phone_type)
            }

        def _parse_record(self, record):
            phone_num, record_offset, phone_type = struct.unpack(self.phone_fmt, record)
            record_content = get_record_content(self.buf, record_offset)
            return Phone._format_phone_content(phone_num, record_content, phone_type)

        def all(self):
            # Walk the index area from its first record to the end of the file.
            records = []
            buflen = len(self.buf)
            for i in range(self.first_phone_record_offset, buflen, self.phone_fmt_length):
                record = self.buf[i: i + self.phone_fmt_length]
                records.append(self._parse_record(record))
            return records

        def test(self):
            self.get_phone_dat_msg()
            records = self.all()
            for record in records:
                print(self.human_phone_info(record))

        @staticmethod
        def human_phone_info(phone_info):
            if not phone_info:
                return ''
            return "{}|{}|{}|{}|{}|{}".format(
                phone_info['phone'], phone_info['province'], phone_info['city'],
                phone_info['zip_code'], phone_info['area_code'], phone_info['phone_type'])


    if __name__ == "__main__":
        phone = Phone()
        phone.test()
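To make the file layout the script relies on easier to see, here is a minimal sketch that builds a tiny buffer in the same shape (header `<4si`, NUL-terminated record text, then `<iiB` index entries) and parses it back. The helper `build_sample_dat`, the version string `b"2301"`, and the sample rows are all made up for illustration; this only mirrors what the script above reads and is not a description of how the real phone.dat is produced.

```python
import struct

HEAD_FMT = "<4si"    # 4-byte version string, int32 offset of the first index entry
INDEX_FMT = "<iiB"   # int32 number prefix, int32 offset of record text, uint8 carrier type


def build_sample_dat(version, entries):
    """Hypothetical helper: pack (prefix, record_text, carrier_type) tuples into one buffer."""
    head_len = struct.calcsize(HEAD_FMT)
    records = b""
    offsets = []
    for _, text, _ in entries:
        offsets.append(head_len + len(records))        # absolute offset of this record text
        records += text.encode("utf-8") + b"\x00"      # record text is NUL-terminated
    first_index_offset = head_len + len(records)
    index = b"".join(
        struct.pack(INDEX_FMT, prefix, offset, carrier)
        for (prefix, _, carrier), offset in zip(entries, offsets)
    )
    return struct.pack(HEAD_FMT, version, first_index_offset) + records + index


def parse_dat(buf):
    """Read the header, then walk the index area to the end of the buffer."""
    version, first_index_offset = struct.unpack_from(HEAD_FMT, buf, 0)
    entry_size = struct.calcsize(INDEX_FMT)
    rows = []
    for pos in range(first_index_offset, len(buf), entry_size):
        prefix, offset, carrier = struct.unpack_from(INDEX_FMT, buf, pos)
        end = buf.find(b"\x00", offset)
        province, city, zip_code, area_code = buf[offset:end].decode("utf-8").split("|")
        rows.append((prefix, province, city, zip_code, area_code, carrier))
    return version, rows


# Made-up sample data, not taken from the real file.
sample = build_sample_dat(b"2301", [
    (1330010, "Beijing|Beijing|100000|010", 3),
    (1330018, "Shanghai|Shanghai|200000|021", 3),
])
version, rows = parse_dat(sample)
for row in rows:
    print("|".join(str(field) for field in row))
```

Each tuple in `rows` has the fields in the order a table row would need, so turning them into INSERT statements or a CSV file is a small step from here.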


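Phone number location database

The number-location data in our database dates from before 2018. Many new numbers were no longer being recognized, so it needed an update. Before starting I thought through the process, which breaks down into these steps:

1. Get the numbers and their area codes
2. Build an area-code dictionary (area code, province, city)
3. Fill in each number's province and city from its area code (number, area code, province, city)
4. Assemble the INSERT statements

Get the numbers and area codes

Here I used a repository I came across on GitHub a long time ago. It uses PHP to pull number prefixes and area codes for the three major carriers; I simply took the data the author had already pulled.

[url]: https://github.com/chenxinbin/china-mobile-location

Data format:

    1330010 010
    1330011 010
    1330018 021

Build the area-code dictionary

I built the dictionary from the area-code information exported from our old database. The data looks like this (area code, province, city, separated by tabs) and is saved to a text file:

    00852	香港	香港
    010	北京	北京
    020	广东	广州

The method: have Python read the file line by line, split each line into a list, and store the parts in two dictionaries.

    # Initialize
    province = {}   # area code -> province
    city = {}       # area code -> city

    # Read, split, and fill the dictionaries
    with open('./dist.txt', 'r', encoding='UTF-8') as a_dist:
        for line in a_dist:
            lines = line.replace('\n', '')   # strip the newline so it doesn't end up in the city
            split = lines.split('\t')        # fields are tab-separated
            province[split[0]] = split[1]    # province lookup by area code
            city[split[0]] = split[2]        # city lookup by area code

Fill in province and city from the area code

With the data from the two steps above, we can produce a number / area code / province / city mapping.

    with open('./10000.txt', 'r', encoding='UTF-8') as a_10000, \
            open('./dianxin.txt', 'a', encoding='UTF-8') as w_txt:   # 'a' appends; start from an empty file
        for number in a_10000:
            split = number.replace('\n', '').split(' ')   # [number, area code]
            if len(split) < 2:
                continue   # skip blank or malformed lines
            try:
                w_txt.write('Number: %s, area code: %s, province: %s, city: %s\n'
                            % (split[0], split[1], province[split[1]], city[split[1]]))
            except KeyError:
                # Area code missing from the dictionary: print it for manual follow-up
                print(split[0])   # number
                print(split[1])   # area code

Assemble the INSERT statements

Adapt the file-writing step above as needed.

    # Table and column names below are from my schema; adjust them to yours.
    with open('./10000.txt', 'r', encoding='UTF-8') as a_10000, \
            open('./dianxin.sql', 'a', encoding='UTF-8') as w_txt:   # 'a' appends; start from an empty file
        for number in a_10000:
            split = number.replace('\n', '').split(' ')   # [number, area code]
            if len(split) < 2:
                continue   # skip blank or malformed lines
            try:
                w_txt.write("INSERT INTO areaNumber (Number,areaNumber,Province,City) "
                            "VALUES ('%s','%s','%s','%s');\n"
                            % (split[0], split[1], province[split[1]], city[split[1]]))
            except KeyError:
                print(split[0])   # number
                print(split[1])   # area code

Then run the SQL file and the rows are added.

Original article; please leave a comment before reposting. This is a beginner's approach and fairly laborious. Feedback welcome~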
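As an alternative to concatenating SQL strings, the same join can be loaded through a parameterized insert. The sketch below is an illustration under stated assumptions rather than the method used above: it uses `sqlite3` from the standard library as a stand-in for MySQL, the table name `area_number` and the helpers `load_area_codes` and `join_numbers` are hypothetical, and the sample inputs (including the unknown area code `0999`) are made up. With a MySQL driver the connection setup and placeholder style would differ.

```python
import io
import sqlite3

# Made-up sample inputs in the two formats described above.
dist_text = "010\tBeijing\tBeijing\n021\tShanghai\tShanghai\n"
numbers_text = "1330010 010\n1330011 010\n1330018 021\n1330019 0999\n"


def load_area_codes(lines):
    """Map area code -> (province, city) from tab-separated lines."""
    areas = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 3:
            areas[parts[0]] = (parts[1], parts[2])
    return areas


def join_numbers(lines, areas):
    """Return (rows, unknown): rows are (number, area_code, province, city)."""
    rows, unknown = [], []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue                      # skip blank or malformed lines
        number, code = parts
        if code in areas:
            rows.append((number, code) + areas[code])
        else:
            unknown.append((number, code))  # keep for manual follow-up
    return rows, unknown


areas = load_area_codes(io.StringIO(dist_text))
rows, unknown = join_numbers(io.StringIO(numbers_text), areas)

# In-memory SQLite database standing in for the real MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE area_number (number TEXT, area_code TEXT, province TEXT, city TEXT)"
)
conn.executemany("INSERT INTO area_number VALUES (?, ?, ?, ?)", rows)
conn.commit()
count = conn.execute("SELECT COUNT(*) FROM area_number").fetchone()[0]
print(count, "rows inserted;", len(unknown), "numbers with unknown area codes")
```

Letting the driver bind the values avoids quoting problems if a province or city name ever contains a quote character, and collecting the unmatched numbers in a list makes them easier to review than lines printed to the console.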


Exporting number-location data from a dat file to a list

Phone number location database


Notepad