Is OpenLDAP suitable for large-scale production deployments?

For about a year we've been using OpenLDAP on Ubuntu Server 10.04 LTS to authenticate about 20 IT users, and everything has been running fine (operations on the LDAP server were basically limited to creating and removing users with Apache Directory Studio).

More recently (6 months ago) we also started implementing OpenLDAP (openldap-2.4.21/debian) as an external authentication system for our website, which is being migrated from an external CMS to a new platform we're developing in house on Drupal. We have a 45K-user database and things haven't been going smoothly at all. The issues we've had are:
- LDAP crashing after a backup restore and needing to be recovered.
- The LDAP recovery tool being unable to recover the LDAP database on some occasions.
- slapd consuming 100% CPU while there is no authentication activity on the website.

Due to lack of resources and knowledge internally, all we've done so far is to find ways of keeping LDAP running without really investigating any of these issues (use monit to restart it when it crashes, db_recover to recover the db if needed, and slapcat to recreate the db from scratch when db_recover fails).

Recently we've had a round of interviews to hire a senior infrastructure engineer to assist us with the various infrastructure issues we're running into. Several candidates confirmed they've either had or heard about issues with OpenLDAP in large production environments: they never managed to get a single stable standalone OpenLDAP server and instead had to build redundant deployments (replication, load balancing, auto-recovery/restart routines) to keep LDAP running. Some candidates even said that OpenLDAP just wasn't fit for production environments and that alternatives such as Novell eDirectory were necessary.

Q: If you have experience running LDAP in production environments with thousands of users, do you have facts to share which show that OpenLDAP is indeed unstable for such setups and that using another LDAP server is indeed recommended?

Answer 1

I use OpenLDAP supporting a user-base of about 10,000 active users who rely on it throughout the day for everything. Problems are rare. Many services rely on it, for authentication and other things.

However, we have 4 read-only replicas (slaves/consumers) behind a load-balancer, a hidden master and a hot standby master. Used to be 2 front-end servers, but we had load problems during certain peak times (when 4,000 or so of those users were desperately trying to hit it at the same second). All write access to LDAP is via our code.

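The kit and the OS are all old, and we're working on replacing them with a new setup that will go back to only 2 replicas (which don't do much else) and "mirror mode" replication between a pair of masters in an HA configuration. Again, problems are rare.

We've had some replication failures in the past, but that was mainly because we were using slurpd rather than syncrepl. Also, an unclean shutdown of a server can corrupt data.

In my experience, the keys to running OpenLDAP in a large production environment are: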

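  1. Someone who understands LDAP and OpenLDAP well. Preferably more than one person.
  2. Someone who is familiar with all the other directly related parts of the infrastructure.
  3. Someone who knows how OpenLDAP replication works.
  4. A reasonable understanding of the BerkeleyDB options (or whatever backend you use), because the defaults aren't quite right.
  5. Highly available slaves. More than 1. Better: truly load balanced.
  6. Active-passive masters (active-active master replication is inherently tricky).
  7. We back up the LDAP data to LDIF hourly and keep a few days' worth on disk. (The whole server gets backed up nightly.)
  8. We have scripts to very quickly bring a broken slave back to a clean, current copy of the data.
  9. We have scripts to very quickly restore a broken master from an LDIF backup (via slapadd).
  10. We can switch to the standby master quickly. (Scripts.)
  11. We monitor that the replication connection is alive.
  12. We monitor that the replication IDs are current on all slaves.
  13. We (less often) monitor that the entire contents of each slave match the master.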
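
Items 11 and 12 can be approximated by comparing the contextCSN values held by the master and by each slave. Below is a minimal sketch of that comparison, not the scripts the answer refers to. It assumes the OpenLDAP 2.4 CSN layout (timestamp#count#sid#mod) and that the values have already been fetched, for example with a base-scope search for the contextCSN attribute on the suffix entry. The function names are made up for illustration.

```python
from datetime import datetime, timezone


def parse_csn(csn):
    """Split one contextCSN value into (timestamp, count, sid, mod).

    Assumes the 2.4-style layout, e.g.
    20120301101500.000000Z#000000#001#000000
    """
    stamp, count, sid, mod = csn.strip().split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    # count, sid and mod are hexadecimal fields
    return when, int(count, 16), int(sid, 16), int(mod, 16)


def newest_by_sid(csns):
    """Map each server ID to the newest (timestamp, count) pair seen."""
    newest = {}
    for csn in csns:
        when, count, sid, _mod = parse_csn(csn)
        if sid not in newest or (when, count) > newest[sid]:
            newest[sid] = (when, count)
    return newest


def stale_sids(master_csns, slave_csns):
    """Return {sid: lag_in_seconds} for every SID where the slave is behind.

    A SID the slave has never seen maps to None. An empty dict means the
    slave is current with respect to the values supplied.
    """
    master = newest_by_sid(master_csns)
    slave = newest_by_sid(slave_csns)
    stale = {}
    for sid, (m_when, m_count) in master.items():
        if sid not in slave:
            stale[sid] = None
        elif slave[sid] < (m_when, m_count):
            stale[sid] = (m_when - slave[sid][0]).total_seconds()
    return stale
```

A monitoring check could call `stale_sids` once per slave and alert when the result is non-empty for longer than some tolerated lag. The threshold is a local decision.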

Basically, though, if it's a critical part of your infrastructure, someone on your team should really understand it well.
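
The hourly LDIF backup from item 7 could look roughly like the sketch below. This is an illustration under assumptions, not the setup described above: the file naming scheme and the three-day retention are hypothetical, and the only OpenLDAP-specific part is `slapcat -l`, which writes the database out as LDIF.

```python
from datetime import datetime, timedelta

# Hypothetical naming scheme: one file per hour, e.g. ldap-2012031012.ldif
BACKUP_PATTERN = "ldap-%Y%m%d%H.ldif"


def backup_name(now):
    """Name of the backup file for the hour containing `now`."""
    return now.strftime(BACKUP_PATTERN)


def slapcat_command(outfile):
    """Argument list for dumping the directory to an LDIF file."""
    return ["slapcat", "-l", outfile]


def expired_backups(filenames, now, keep_days=3):
    """Return the backup files older than keep_days, sorted by name.

    Files that don't match BACKUP_PATTERN are ignored, so unrelated
    files in the backup directory are never selected for deletion.
    """
    cutoff = now - timedelta(days=keep_days)
    expired = []
    for name in filenames:
        try:
            taken = datetime.strptime(name, BACKUP_PATTERN)
        except ValueError:
            continue  # not one of our backups
        if taken < cutoff:
            expired.append(name)
    return sorted(expired)
```

An hourly cron job could run the command from `slapcat_command(backup_name(now))` with `subprocess.run`, then delete whatever `expired_backups` returns. The same LDIF files are what a slapadd-based restore (item 9) would load.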

Addendum: As requested, here is the DB_CONFIG file from my OpenLDAP DB directory. See http://docs.oracle.com/cd/E17076_02/html/api_reference/C/configuration_reference.html for details.

set_cachesize 0 536870912 1
set_flags DB_TXN_NOSYNC
set_flags DB_TXN_WRITE_NOSYNC
set_lg_regionmax 268435456
set_lg_max 536870912
set_lg_bsize 134217728
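
The sizes above are raw byte counts. The small sketch below converts them to MiB. It assumes the Berkeley DB argument order for set_cachesize (gigabytes, additional bytes, number of cache segments), and the function name is made up for illustration.

```python
MIB = 1024 * 1024

DB_CONFIG = """\
set_cachesize 0 536870912 1
set_flags DB_TXN_NOSYNC
set_flags DB_TXN_WRITE_NOSYNC
set_lg_regionmax 268435456
set_lg_max 536870912
set_lg_bsize 134217728
"""


def size_directives_in_mib(db_config_text):
    """Return {directive: size in MiB} for the size-related directives.

    set_flags lines and anything unrecognised are skipped.
    """
    result = {}
    for line in db_config_text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue
        name, args = parts[0], parts[1:]
        if name == "set_cachesize":
            # arguments: gbytes, bytes, ncache
            gbytes, extra_bytes = int(args[0]), int(args[1])
            result[name] = (gbytes * 1024**3 + extra_bytes) / MIB
        elif name in ("set_lg_regionmax", "set_lg_max", "set_lg_bsize"):
            result[name] = int(args[0]) / MIB
    return result


if __name__ == "__main__":
    for directive, mib in size_directives_in_mib(DB_CONFIG).items():
        print(f"{directive}: {mib:g} MiB")
```

So this is a 512 MiB cache in a single segment, 512 MiB log files, a 128 MiB log buffer and a 256 MiB log region. The two set_flags lines tell Berkeley DB not to flush the log synchronously on commit. That trades durability for write speed, which is worth weighing against the earlier remark about unclean shutdowns.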
