Solana crashes when sysctl can't be accessed #35329

Open
erik78se opened this issue Feb 27, 2024 · 2 comments

@erik78se

When I start Solana on my bare-metal server, which has been hardened to disallow access to sysctl (for security reasons), Solana crashes.

[2024-02-27T10:25:59.958543482Z WARN  solana_perf] CUDA is disabled
[2024-02-27T10:25:59.958580863Z INFO  solana_perf] AVX detected
[2024-02-27T10:25:59.958587003Z INFO  solana_perf] AVX2 detected
[2024-02-27T10:26:00.282695872Z INFO  solana_validator] obtained shred-version 35459 from 139.178.68.207:8001
[2024-02-27T10:26:00.283062450Z INFO  solana_metrics::metrics] metrics disabled: environment variable not found
[2024-02-27T10:26:00.283299354Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.rmem_max: no such sysctl: net.core.rmem_max
[2024-02-27T10:26:00.283319252Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.optmem_max: no such sysctl: net.core.optmem_max
[2024-02-27T10:26:00.283330789Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.netdev_max_backlog: no such sysctl: net.core.netdev_max_backlog
[2024-02-27T10:26:00.283341876Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.wmem_max: no such sysctl: net.core.wmem_max
[2024-02-27T10:26:00.283352640Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.rmem_default: no such sysctl: net.core.rmem_default
[2024-02-27T10:26:00.283362472Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.wmem_default: no such sysctl: net.core.wmem_default
[2024-02-27T10:26:00.283370299Z WARN  solana_core::system_monitor_service]   vm.max_map_count: recommended=1000000 current=262144, too small
[2024-02-27T10:26:00.283376815Z WARN  solana_core::system_monitor_service]   net.core.rmem_max: recommended=134217728 current=-1, too small
[2024-02-27T10:26:00.283384814Z WARN  solana_core::system_monitor_service]   net.core.optmem_max: recommended=0 current=-1, too small
[2024-02-27T10:26:00.283390150Z WARN  solana_core::system_monitor_service]   net.core.netdev_max_backlog: recommended=0 current=-1, too small
[2024-02-27T10:26:00.283397157Z WARN  solana_core::system_monitor_service]   net.core.wmem_max: recommended=134217728 current=-1, too small
[2024-02-27T10:26:00.283401841Z WARN  solana_core::system_monitor_service]   net.core.rmem_default: recommended=134217728 current=-1, too small
[2024-02-27T10:26:00.283406696Z WARN  solana_core::system_monitor_service]   net.core.wmem_default: recommended=134217728 current=-1, too small
OS network limit test failed. See: https://docs.solana.com/running-validator/validator-start#system-tuning

I would rather see the detection of these values handle this gracefully and not crash. Perhaps I can submit a patch that won't error out but simply warn?

fn linux_get_current_network_limits() -> Vec<(&'static str, &'static InterestingLimit, i64)> {
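
To make the warn-instead-of-fail idea concrete, here is a minimal sketch of how unreadable values could be separated from values that are readable but too small. It is not the actual `system_monitor_service` code: it assumes Linux, reads values straight from `/proc/sys` using only the standard library, and every name in it (`LimitStatus`, `sysctl_path`, `read_sysctl`, `classify`, `limits_acceptable`) is hypothetical.

```rust
use std::fs;
use std::path::PathBuf;

/// Outcome of checking a single kernel limit (hypothetical type).
#[derive(Debug, PartialEq)]
enum LimitStatus {
    /// The value was read and meets the recommendation.
    Ok,
    /// The value was read but is below the recommendation.
    TooSmall { current: i64, recommended: i64 },
    /// The value could not be read, e.g. on a hardened host.
    Unreadable,
}

/// Map a dotted sysctl key such as "net.core.rmem_max" to its /proc/sys path.
/// Assumption: no key component contains a dot, which holds for the keys
/// listed in the log above.
fn sysctl_path(key: &str) -> PathBuf {
    let mut path = PathBuf::from("/proc/sys");
    for part in key.split('.') {
        path.push(part);
    }
    path
}

/// Read a sysctl value, returning None instead of a sentinel like -1 when
/// the file is missing, inaccessible, or does not start with an integer.
fn read_sysctl(key: &str) -> Option<i64> {
    let raw = fs::read_to_string(sysctl_path(key)).ok()?;
    raw.split_whitespace().next()?.parse().ok()
}

/// Compare an optional current value against the recommended minimum.
fn classify(current: Option<i64>, recommended: i64) -> LimitStatus {
    match current {
        None => LimitStatus::Unreadable,
        Some(value) if value < recommended => LimitStatus::TooSmall {
            current: value,
            recommended,
        },
        Some(_) => LimitStatus::Ok,
    }
}

/// Only values that were read and found too small fail the check;
/// unreadable values would be logged as warnings by the caller.
fn limits_acceptable(statuses: &[LimitStatus]) -> bool {
    !statuses
        .iter()
        .any(|status| matches!(status, LimitStatus::TooSmall { .. }))
}
```

With this split, the six `net.core.*` keys from the log would come back as `Unreadable` and only produce warnings. Note that `vm.max_map_count` was readable in the log above (current=262144, recommended=1000000), so under this sketch it would still be classified as `TooSmall` and fail the check.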

@steviez
Contributor

steviez commented Feb 27, 2024

For the general case, this check is helpful and provides an immediate failure / useful error message. That being said, if you're setting up your system in a particular manner that disallows the check from working and you're confident that you're tuning it appropriately, you can bypass the check with:

--no-os-network-limits-test

solana/validator/src/main.rs

Lines 1719 to 1726 in 8ad125d

if !matches.is_present("no_os_network_limits_test") {
    if SystemMonitorService::check_os_network_limits() {
        info!("OS network limits test passed.");
    } else {
        eprintln!("OS network limit test failed. See: https://docs.solanalabs.com/operations/guides/validator-start#system-tuning");
        exit(1);
    }
}

3 participants
@erik78se @steviez and others