Many crawlers do not work well with single-page applications. You can use a prerendering solution such as https://prerender.io/ or https://www.prerender.cloud/. I use https://www.prerender.cloud/ with Netlify, and it works well.
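If you want to stay on Firebase Hosting, try to minimize the code that runs when a robot visits the page. Do not load any library or run anything that is not required to render the tags and data that robots and crawlers need. Example below.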
index.html
<script>
(function(w,d){
w.myApp = w.myApp || {}; w.myApp.robot = false;
var AM_I_ROBOTS = ['googlebot', 'twitterbot', 'facebookexternalhit', 'google.com/bot.html', 'facebook.com/externalhit_uatext.php', 'tweetmemebot', 'sitebot', 'msnbot', 'robot', 'bot', 'spider', 'crawl'];
var ua = navigator.userAgent.toLowerCase(); w.myApp.userAgent = ua;
for (var i = 0, len = AM_I_ROBOTS.length; i < len; i++) { if(ua.indexOf(AM_I_ROBOTS[i]) !== -1 ) { w.myApp.robot = true; break; }}
})(window,document);
</script>
<script>
if(!window.myApp.robot){
// Google Analytics code
}
</script>
<script>
if(!window.myApp.robot){
// Facebook Connect code
}
</script>
app.component.ts
export class AppComponent implements OnDestroy, OnInit, AfterViewInit {
...
public webRobot: boolean = false;
// string[] rather than [string]: [string] is a one-element tuple type and rejects this list
private static readonly AM_I_ROBOTS: string[] = ['googlebot', 'twitterbot', 'facebookexternalhit', 'google.com/bot.html',
'facebook.com/externalhit_uatext.php', 'tweetmemebot', 'sitebot', 'msnbot', 'robot',
'bot', 'spider', 'crawl'];
...
constructor(private auth: Auth,
private localStorage: LocalStorageService,
private meta: MetaService,
...
private otherService: OtherService,
) {
}
ngOnInit(): void {
this.init();
}
init() {
const robots = AppComponent.AM_I_ROBOTS;
const ua = navigator.userAgent.toLowerCase();
for (let i = 0, len = robots.length; i < len; i++) {
if(ua.indexOf(robots[i]) !== -1 ) {
this.webRobot = true;
break;
}
}
// services that should still run, but in a minimal mode,
// are told whether the visitor is a robot
this.otherService.init(this.webRobot);
// services that should not run at all for robots
if (!this.webRobot) {
this.auth.init();
// etc.
}
}
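
The same substring check appears in both index.html and app.component.ts, so it could be pulled into one small function. This is only a minimal sketch: the names `ROBOT_MARKERS` and `isRobot` are hypothetical and not part of the code above, and the marker list is simply copied from it.

```typescript
// Hypothetical shared helper; ROBOT_MARKERS and isRobot are illustrative
// names that do not appear in the original snippets.
const ROBOT_MARKERS: readonly string[] = [
  "googlebot", "twitterbot", "facebookexternalhit", "google.com/bot.html",
  "facebook.com/externalhit_uatext.php", "tweetmemebot", "sitebot", "msnbot",
  "robot", "bot", "spider", "crawl",
];

// Returns true when the lower-cased user-agent string contains any marker.
function isRobot(
  userAgent: string,
  markers: readonly string[] = ROBOT_MARKERS
): boolean {
  const ua = userAgent.toLowerCase();
  // indexOf keeps the same matching rule as the loops in the snippets above.
  return markers.some((marker) => ua.indexOf(marker) !== -1);
}
```

With something like this in place, `init()` in AppComponent could probably shrink to `this.webRobot = isRobot(navigator.userAgent);`. Keep in mind that a user-agent check is a heuristic: short markers such as `bot` can match ordinary browsers whose user-agent string happens to contain that substring.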