[Bug Report] Version 0.5.0 does not work when using an external dns service. #412

Closed
3 tasks done
BOBINIUNIU opened this issue Jan 4, 2024 · 39 comments · Fixed by #422

Comments

@BOBINIUNIU

Checks

  • I have searched the existing issues
  • I have read the documentation
  • Is it your first time submitting an issue

Current Behavior

Version 0.5.0 does not work when using an external DNS service. Version 0.4 works fine.
Log:
Jan 04 22:50:39 pve dae[1493]: level=info msg="192.168.1.6:59156 <-> 127.0.0.1:53" _qname=yahoo.com. dialer=direct dscp=0 mac="58:47:ca:72:64:15" network="udp4(DNS)" outbound=direct pid=0 >
Jan 04 22:50:41 pve dae[1493]: level=info msg="192.168.1.6:59158 <-> 127.0.0.1:53" _qname=yahoo.com. dialer=direct dscp=0 mac="58:47:ca:72:64:15" network="udp4(DNS)" outbound=direct pid=0 >
Jan 04 22:50:41 pve dae[1493]: level=info msg="192.168.1.79:59322 <-> 127.0.0.1:53" _qname=f-vali.cp31.ott.cibntv.net. dialer=direct dscp=0 mac="7a:1b:06:73:2c:0d" network="udp4(DNS)" outb>
Jan 04 22:50:42 pve dae[1493]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:45345: i/o timeout"
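
To see how often each upstream times out, the warning lines can be tallied. The following is a minimal Python sketch that assumes the log format shown above; `TIMEOUT_RE` and `count_timeouts` are illustrative names and are not part of dae.

```python
import re
from collections import Counter

# Matches dae's "handlePkt: failed to read from" warning as it appears in the log above.
TIMEOUT_RE = re.compile(
    r"handlePkt: failed to read from: (?P<upstream>\S+) "
    r"\(dialer: (?P<dialer>[^)]+)\): .*i/o timeout"
)


def count_timeouts(lines):
    """Return a Counter mapping (upstream, dialer) to the number of i/o timeouts."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("upstream"), match.group("dialer"))] += 1
    return counts
```

The function accepts any iterable of log lines, for example the lines of a saved journal export. A count that is concentrated on a single upstream, such as 127.0.0.1:53 here, points at that resolver path rather than at general connectivity.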

Expected Behavior

No response

Steps to Reproduce

No response

Environment

  • Dae version (use dae --version):
  • OS (e.g cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Others:

Anything else?

global {
    log_level: error
    lan_interface: enp2s0
    wan_interface: enp2s0
    allow_insecure: false
    dial_mode: domain
}

node {
    root_node: 'socks5://127.0.0.1:10000'
    cloud_node: 'socks5://127.0.0.1:10001'
}

group {
    root_group {
        policy: fixed(0)
    }
}

dns {
    upstream {
        coredns: 'udp://127.0.0.1:53'
    }

    routing {
        request {
            fallback: coredns
        }
    }
}

routing {
    pname(coredns) && l4proto(udp) && dport(53) -> must_direct
    pname(naive) -> must_direct
    dport(53) -> direct

@dae-prow
Contributor

dae-prow bot commented Jan 4, 2024

Thanks for opening this issue!

@jschwinger233
Member

Hi, would you care to give #414 a shot? You can download a statically linked binary from https://github.com/daeuniverse/dae/actions/runs/7431631804

@hyunrealshadow

hyunrealshadow commented Jan 7, 2024

@jschwinger233 I have observed something very strange in this version.
In previous versions, I found that IPv6 UDP was not working.
Running tcpdump on the router captured the following packets:
image
Looking through the issues, I found that #387 might have fixed the problem, so I tried the latest daily build, but it didn't work for me.
I then tried the build from this PR: IPv6 works for me now, but IPv4 no longer does.
image
Capturing packets with Wireshark on my PC, I found that IPv4 DNS queries get no response, while IPv6 DNS queries resolve normally.
image
image

My configuration is as follows. I apologize if it's a configuration problem.

global {
    ##### Software options.

    # tproxy port to listen on. It is NOT a HTTP/SOCKS port, and is just used by eBPF program.
    # In normal case, you do not need to use it.
    tproxy_port: 12345

    # Set it true to protect tproxy port from unsolicited traffic. Set it false to allow users to use self-managed
    # iptables tproxy rules.
    tproxy_port_protect: true

    # If not zero, traffic sent from dae will be set SO_MARK. It is useful to avoid traffic loop with iptables tproxy
    # rules.
    so_mark_from_dae: 0

    # Log level: error, warn, info, debug, trace.
    log_level: info

    # Disable waiting for network before pulling subscriptions.
    disable_waiting_network: false


    ##### Interface and kernel options.

    # The LAN interface to bind. Use it if you want to proxy LAN.
    # Multiple interfaces split by ",".
    lan_interface: br0

    # The WAN interface to bind. Use it if you want to proxy localhost.
    # Multiple interfaces split by ",". Use "auto" to auto detect.
    #wan_interface: eth0

    # Automatically configure Linux kernel parameters like ip_forward and send_redirects. Check out
    # https://github.com/daeuniverse/dae/blob/main/docs/en/user-guide/kernel-parameters.md to see what will dae do.
    auto_config_kernel_parameter: true


    ##### Node connectivity check.

    # Host of URL should have both IPv4 and IPv6 if you have double stack in local.
    # First is URL, others are IP addresses if given.
    # Considering traffic consumption, it is recommended to choose a site with anycast IP and less response.
    #tcp_check_url: 'http://cp.cloudflare.com'
    tcp_check_url: 'http://cp.cloudflare.com'

    # The HTTP request method to `tcp_check_url`. Use 'HEAD' by default because some server implementations bypass
    # accounting for this kind of traffic.
    tcp_check_http_method: HEAD

    # This DNS will be used to check UDP connectivity of nodes. And if dns_upstream below contains tcp, it also be used to check
    # TCP DNS connectivity of nodes.
    # First is URL, others are IP addresses if given.
    # This DNS should have both IPv4 and IPv6 if you have double stack in local.
    #udp_check_dns: 'dns.google.com:53'
    udp_check_dns: '1.1.1.1'

    check_interval: 30s

    # Group will switch node only when new_latency <= old_latency - tolerance.
    check_tolerance: 50ms


    ##### Connecting options.

    # Optional values of dial_mode are:
    # 1. "ip". Dial proxy using the IP from DNS directly. This allows your ipv4, ipv6 to choose the optimal path
    #       respectively, and makes the IP version requested by the application meet expectations. For example, if you
    #       use curl -4 ip.sb, you will request IPv4 via proxy and get a IPv4 echo. And curl -6 ip.sb will request IPv6.
    #       This may solve some weird full-cone problems if you are sure your node supports that. Sniffing will be disabled
    #       in this mode.
    # 2. "domain". Dial proxy using the domain from sniffing. This will relieve DNS pollution problem to a great extent
    #       if you have an impure DNS environment. Generally, this mode brings faster proxy response time because proxy will
    #       re-resolve the domain in remote, thus get better IP result to connect. This policy does not impact routing.
    #       That is to say, domain rewrite will be after traffic split of routing and dae will not re-route it.
    # 3. "domain+". Based on domain mode but do not check the reality of sniffed domain. It is useful for users whose
    #       DNS requests do not go through dae but want faster proxy response time. Notice that, if DNS requests do not
    #       go through dae, dae cannot split traffic by domain.
    # 4. "domain++". Based on domain+ mode but force to re-route traffic using sniffed domain to partially recover
    #       domain based traffic split ability. It doesn't work for direct traffic and consumes more CPU resources.
    dial_mode: domain++

    # Allow insecure TLS certificates. It is not recommended to turn it on unless you have to.
    allow_insecure: false

    # Timeout to waiting for first data sending for sniffing. It is always 0 if dial_mode is ip. Set it higher is useful
    # in high latency LAN network.
    sniffing_timeout: 100ms

    # TLS implementation. tls is to use Go's crypto/tls. utls is to use uTLS, which can imitate browser's Client Hello.
    tls_implementation: tls

    # The Client Hello ID for uTLS to imitate. This takes effect only if tls_implementation is utls.
    # See more: https://github.com/daeuniverse/dae/blob/331fa23c16/component/outbound/transport/tls/utls.go#L17
    utls_imitate: chrome_auto
}

# Subscriptions defined here will be resolved as nodes and merged as a part of the global node pool.
# Support to give the subscription a tag, and filter nodes from a given subscription in the group section.
subscription {
}

# Nodes defined here will be merged as a part of the global node pool.
node {
    # Add your node links here.
    # Support socks5, http, https, ss, ssr, vmess, vless, trojan, tuic, juicity, etc.
    # Full support list: https://github.com/daeuniverse/dae/blob/main/docs/en/proxy-protocols.md
    proxy: 'socks5://127.0.0.1:7890'
}

# See https://github.com/daeuniverse/dae/blob/main/docs/en/configuration/dns.md for full examples.
dns {
}

# Node group (outbound).
group {
    proxy {
        policy: fixed(0)
    }
}

# See https://github.com/daeuniverse/dae/blob/main/docs/en/configuration/routing.md for full examples.
routing {
    ### Preset rules.

    # Network managers in localhost should be direct to avoid false negative network connectivity check when binding to
    # WAN.
    pname(NetworkManager) -> direct
    pname(dhclient) -> direct
    pname(dhcp6c) -> direct
    pname(naive) -> must_direct
    pname(sing-box) -> must_direct
    pname(tailscaled) -> must_direct
    pname(dotnet) -> must_direct

    # Put it in the front to prevent broadcast, multicast and other packets that should be sent to the LAN from being
    # forwarded by the proxy.
    # "dip" means destination IP.
    dip(100.64.0.0/10, 'fd7a:115c:a1e0::/48') -> must_direct

    # This line allows you to access private addresses directly instead of via your proxy. If you really want to access
    # private addresses in your proxy host network, modify the below line.
    dip(geoip:private) -> direct

    ### Write your rules below.
    dip(geoip:cn) -> direct

    fallback: proxy
}

@hyunrealshadow

For additional information: my router OS is VyOS, the kernel has been recompiled according to dae's requirements, and the DNS service is Technitium DNS.
The sockets listening on port 53 are as follows:

vyos@vyos:~$ sudo lsof -i :53
COMMAND  PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
dotnet  7813 root  209u  IPv4 1079594      0t0  UDP *:domain
dotnet  7813 root  210u  IPv4 1079595      0t0  TCP *:domain (LISTEN)
dotnet  7813 root  211u  IPv6 1079596      0t0  UDP *:domain
dotnet  7813 root  212u  IPv6 1079597      0t0  TCP *:domain (LISTEN)

@jschwinger233
Member

@hyunrealshadow Can I see dae's log for IPv4 DNS failure?

@hyunrealshadow

@jschwinger233 Some of the logs are as follows. I don't see any failures related to IPv4.

Jan 07 06:39:06 vyos dae[2662]: level=info msg="10.10.10.1:65097 <-> 10.10.0.1:53" _qname=79.ucp-ntfy.kaspersky-labs.com. dialer=direct dscp=0 mac="d8:bb:c1:6f:b0:8c" network="udp4(DNS)" outbound=d>
Jan 07 06:39:06 vyos dae[2662]: level=info msg="10.10.10.1:61424 <-> 10.10.0.1:53" _qname=79.ucp-ntfy.kaspersky-labs.com. dialer=direct dscp=0 mac="d8:bb:c1:6f:b0:8c" network="udp4(DNS)" outbound=d>
Jan 07 06:39:06 vyos dae[2662]: level=info msg="10.10.10.1:54729 <-> 10.10.0.1:53" _qname=spclient.wg.spotify.com. dialer=direct dscp=0 mac="d8:bb:c1:6f:b0:8c" network="udp4(DNS)" outbound=direct p>
Jan 07 06:39:06 vyos dae[2662]: level=info msg="10.10.10.1:58039 <-> 10.10.0.1:53" _qname=spclient.wg.spotify.com. dialer=direct dscp=0 mac="d8:bb:c1:6f:b0:8c" network="udp4(DNS)" outbound=direct p>
Jan 07 06:39:06 vyos dae[2662]: level=info msg="10.10.10.1:55494 <-> 10.10.0.1:53" _qname=mobile.events.data.microsoft.com. dialer=direct dscp=0 mac="d8:bb:c1:6f:b0:8c" network="udp4(DNS)" outbound>
Jan 07 06:39:06 vyos dae[2662]: level=info msg="10.10.10.1:65176 <-> 10.10.0.1:53" _qname=mobile.events.data.microsoft.com. dialer=direct dscp=0 mac="d8:bb:c1:6f:b0:8c" network="udp4(DNS)" outbound>
Jan 07 06:39:06 vyos dae[2662]: level=info msg="10.10.10.1:57719 <-> 10.10.0.1:53" _qname=mobile.events.data.microsoft.com. dialer=direct dscp=0 mac="d8:bb:c1:6f:b0:8c" network="udp4(DNS)" outbound>
Jan 07 06:39:06 vyos dae[2662]: level=warning msg="handlePkt: failed to read from: [fd69:99a9:e::1]:53 (dialer: direct): read udp [::]:60995: i/o timeout"
Jan 07 06:39:06 vyos dae[2662]: level=warning msg="handlePkt: failed to read from: [fd69:99a9:e::1]:53 (dialer: direct): read udp [::]:57443: i/o timeout"

@jschwinger233
Member

@hyunrealshadow You already tried #414 but it still failed on IPv4 DNS, right? Could you set the log level to trace and share the log again? Thanks!

@hyunrealshadow

@jschwinger233 Before #414, IPv6 DNS didn't work; with #414, IPv4 DNS doesn't work.
The trace logs are as follows:

Jan 07 12:47:02 vyos dae[4364]: level=trace msg="Choose DNS path" choose="udp+6" ipversions=[6] l4protos=[udp] upstream="udp://[fd69:99a9:e::1]:53" use="[fd69:99a9:e::1]:53"
Jan 07 12:47:02 vyos dae[4364]: level=trace msg="Received UDP(DNS) 10.10.10.1:60978 <-> 10.10.0.1:53: tpstelemetry.tencent.com. AAAA"
Jan 07 12:47:02 vyos dae[4364]: level=trace msg="Received UDP(DNS) 10.10.10.1:60979 <-> 10.10.0.1:53: tpstelemetry.tencent.com. A"
Jan 07 12:47:02 vyos dae[4364]: level=trace msg=Accept question=[{tpstelemetry.tencent.com. 1 1}] upstream=asis
Jan 07 12:47:02 vyos dae[4364]: level=info msg="[fd69:99a9:e:0:e95f:7d99:1257:a569]:60979 <-> [fd69:99a9:e::1]:53" _qname=tpstelemetry.tencent.com. dialer=direct dscp=0 mac="d8:bb:c1:6f:b0:8c" netw>
Jan 07 12:47:02 vyos dae[4364]: level=trace msg="Update DNS record cache" _qname=tpstelemetry.tencent.com. ans="tpstelemetry.tencent.com.(A): 113.240.75.252; tpstelemetry.tencent.com.(A): 113.240.7>
Jan 07 12:47:02 vyos dae[4364]: level=debug msg="UDP(DNS) 10.10.10.1:60979 <-> Cache: tpstelemetry.tencent.com. A"
Jan 07 12:47:03 vyos dae[4364]: level=trace msg=Accept question=[{tpstelemetry.tencent.com. 28 1}] upstream=asis
Jan 07 12:47:03 vyos dae[4364]: level=info msg="[fd69:99a9:e:0:e95f:7d99:1257:a569]:60978 <-> [fd69:99a9:e::1]:53" _qname=tpstelemetry.tencent.com. dialer=direct dscp=0 mac="d8:bb:c1:6f:b0:8c" netw>
Jan 07 12:47:03 vyos dae[4364]: level=trace msg="Update DNS record cache" _qname=tpstelemetry.tencent.com. ans="tpstelemetry.tencent.com.(AAAA): 240e:97c:2f:2::5c" rcode=0
Jan 07 12:47:03 vyos dae[4364]: level=debug msg="UDP(DNS) 10.10.10.1:60978 <-> Cache: tpstelemetry.tencent.com. AAAA"
Jan 07 12:47:03 vyos dae[4364]: level=trace msg="Received UDP(DNS) 10.10.10.1:65213 <-> 10.10.0.1:53: baidu.com. AAAA"
Jan 07 12:47:03 vyos dae[4364]: level=debug msg="UDP(DNS) 10.10.10.1:65213 <-> Cache: baidu.com. AAAA"
Jan 07 12:47:06 vyos dae[4364]: level=info msg="10.10.10.1:59854 <-> 13.91.148.88:3544" dialer=proxy dscp=0 ip="13.91.148.88:3544" mac="d8:bb:c1:6f:b0:8c" network=udp4 outbound=proxy pid=0 pname= p>
Jan 07 12:47:08 vyos dae[4364]: level=trace msg="Received UDP(DNS) 10.10.1.6:59364 <-> 10.10.0.1:53: api.miwifi.com. A"
Jan 07 12:47:08 vyos dae[4364]: level=trace msg="Request to DNS upstream" question=[{api.miwifi.com. 1 1}] upstream=asis
Jan 07 12:47:08 vyos dae[4364]: level=trace msg="Choose DNS path" choose="udp+4" ipversions=[4] l4protos=[udp] upstream="udp://10.10.0.1:53" use="10.10.0.1:53"
Jan 07 12:47:08 vyos dae[4364]: level=trace msg="Received UDP(DNS) 10.10.1.6:59364 <-> 10.10.0.1:53: api.miwifi.com. AAAA"
Jan 07 12:47:08 vyos dae[4364]: time="2024-01-07T12:47:08Z" level=trace msg="Port in use, fallback to use netns." from="10.10.0.1:53" realTo="10.10.1.6:59364" to="10.10.1.6:59364"
Jan 07 12:47:08 vyos dae[4364]: level=debug msg="UDP(DNS) 10.10.1.6:59364 <-> Cache: api.miwifi.com. AAAA"
Jan 07 12:47:08 vyos dae[4364]: level=trace msg=Accept question=[{api.miwifi.com. 1 1}] upstream=asis
Jan 07 12:47:08 vyos dae[4364]: level=info msg="10.10.1.6:59364 <-> 10.10.0.1:53" _qname=api.miwifi.com. dialer=direct dscp=0 mac="88:c3:97:c8:0e:29" network="udp4(DNS)" outbound=direct pid=0 pname>
Jan 07 12:47:08 vyos dae[4364]: level=trace msg="Update DNS record cache" _qname=api.miwifi.com. ans="api.miwifi.com.(A): 161.117.95.80" rcode=0
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Received UDP(DNS) [fd69:99a9:e:0:e95f:7d99:1257:a569]:51690 <-> [fd69:99a9:e::1]:53: api.k8slens.dev. A"
Jan 07 12:47:10 vyos dae[4364]: time="2024-01-07T12:47:10Z" level=trace msg="Port in use, fallback to use netns." from="[fd69:99a9:e::1]:53" realTo="[fd69:99a9:e:0:e95f:7d99:1257:a569]:51690" to="[>
Jan 07 12:47:10 vyos dae[4364]: level=debug msg="UDP(DNS) [fd69:99a9:e:0:e95f:7d99:1257:a569]:51690 <-> Cache: api.k8slens.dev. A"
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Received UDP(DNS) [fd69:99a9:e:0:e95f:7d99:1257:a569]:55459 <-> [fd69:99a9:e::1]:53: api.k8slens.dev. AAAA"
Jan 07 12:47:10 vyos dae[4364]: level=debug msg="UDP(DNS) [fd69:99a9:e:0:e95f:7d99:1257:a569]:55459 <-> Cache: api.k8slens.dev. AAAA"
Jan 07 12:47:10 vyos dae[4364]: level=debug msg="Rewrite dial target to domain" from="45.60.49.15:443" to="api.k8slens.dev:443"
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="outbound: <Control Plane Routing> => <index: 2>"
Jan 07 12:47:10 vyos dae[4364]: level=debug msg="Rewrite dial target to domain" from="45.60.49.15:443" to="api.k8slens.dev:443"
Jan 07 12:47:10 vyos dae[4364]: level=info msg="10.10.10.1:59035 <-> api.k8slens.dev:443" dialer=proxy dscp=0 ip="45.60.49.15:443" mac="d8:bb:c1:6f:b0:8c" network=tcp4 outbound=proxy pid=0 pname= p>
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Received UDP(DNS) 10.10.1.6:45615 <-> 10.10.0.1:53: www.baidu.com. A"
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Received UDP(DNS) 10.10.1.6:45615 <-> 10.10.0.1:53: www.baidu.com. AAAA"
Jan 07 12:47:10 vyos dae[4364]: level=debug msg="UDP(DNS) 10.10.1.6:45615 <-> Cache: www.baidu.com. AAAA"
Jan 07 12:47:10 vyos dae[4364]: level=debug msg="UDP(DNS) 10.10.1.6:45615 <-> Cache: www.baidu.com. A"
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Received UDP(DNS) [fd69:99a9:e:0:e95f:7d99:1257:a569]:51690 <-> [fd69:99a9:e::1]:53: dc.services.visualstudio.com. A"
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Request to DNS upstream" question=[{dc.services.visualstudio.com. 1 1}] upstream=asis
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Choose DNS path" choose="udp+6" ipversions=[6] l4protos=[udp] upstream="udp://[fd69:99a9:e::1]:53" use="[fd69:99a9:e::1]:53"
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Received UDP(DNS) [fd69:99a9:e:0:e95f:7d99:1257:a569]:55459 <-> [fd69:99a9:e::1]:53: dc.services.visualstudio.com. AAAA"
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Request to DNS upstream" question=[{dc.services.visualstudio.com. 28 1}] upstream=asis
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Choose DNS path" choose="udp+6" ipversions=[6] l4protos=[udp] upstream="udp://[fd69:99a9:e::1]:53" use="[fd69:99a9:e::1]:53"
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Received UDP(DNS) 10.10.10.1:51690 <-> 10.10.0.1:53: dc.services.visualstudio.com. A"
Jan 07 12:47:10 vyos dae[4364]: level=trace msg="Received UDP(DNS) 10.10.10.1:55459 <-> 10.10.0.1:53: dc.services.visualstudio.com. AAAA"
Jan 07 12:47:11 vyos dae[4364]: level=trace msg=Accept question=[{dc.services.visualstudio.com. 28 1}] upstream=asis
Jan 07 12:47:11 vyos dae[4364]: level=info msg="[fd69:99a9:e:0:e95f:7d99:1257:a569]:55459 <-> [fd69:99a9:e::1]:53" _qname=dc.services.visualstudio.com. dialer=direct dscp=0 mac="d8:bb:c1:6f:b0:8c" >
Jan 07 12:47:11 vyos dae[4364]: level=trace msg="Update DNS record cache" _qname=dc.services.visualstudio.com. ans="dc.services.visualstudio.com.(CNAME): dc.applicationinsights.microsoft.com." rcod>
Jan 07 12:47:11 vyos dae[4364]: level=debug msg="UDP(DNS) 10.10.10.1:55459 <-> Cache: dc.services.visualstudio.com. AAAA"
Jan 07 12:47:11 vyos dae[4364]: level=trace msg="Received UDP(DNS) [fd69:99a9:e:0:e95f:7d99:1257:a569]:51690 <-> [fd69:99a9:e::1]:53: dc.services.visualstudio.com. A"

@jschwinger233
Member

@hyunrealshadow I really appreciate it! Thanks to your log I just pinpointed a bug and pushed another commit to #414: f1c2111

Would you like to download the latest build and try again?

@hyunrealshadow

@jschwinger233 Unfortunately, I tried the latest build and it still doesn't work.
As soon as I run systemctl stop dae, IPv4 DNS lookups work fine. Incidentally, since only the br0 interface is configured, lookups made locally on the router are fine.

Jan 07 13:05:01 vyos dae[5166]: level=trace msg="Received UDP(DNS) 10.10.10.1:56626 <-> 10.10.0.1:53: baidu.com. A"
Jan 07 13:05:01 vyos dae[5166]: level=debug msg="UDP(DNS) 10.10.10.1:56626 <-> Cache: baidu.com. A"
Jan 07 13:05:02 vyos dae[5166]: level=trace msg="Received UDP(DNS) 10.10.1.6:39581 <-> 10.10.0.1:53: api.miwifi.com. A"
Jan 07 13:05:02 vyos dae[5166]: level=debug msg="UDP(DNS) 10.10.1.6:39581 <-> Cache: api.miwifi.com. A"
Jan 07 13:05:02 vyos dae[5166]: level=trace msg="Received UDP(DNS) 10.10.1.6:39581 <-> 10.10.0.1:53: api.miwifi.com. AAAA"
Jan 07 13:05:02 vyos dae[5166]: level=debug msg="UDP(DNS) 10.10.1.6:39581 <-> Cache: api.miwifi.com. AAAA"
Jan 07 13:05:02 vyos dae[5166]: level=trace msg="Received UDP(DNS) [fd69:99a9:e:0:e95f:7d99:1257:a569]:55493 <-> [fd69:99a9:e::1]:53: tpstelemetry.tencent.com. AAAA"
Jan 07 13:05:02 vyos dae[5166]: level=debug msg="UDP(DNS) [fd69:99a9:e:0:e95f:7d99:1257:a569]:55493 <-> Cache: tpstelemetry.tencent.com. AAAA"
Jan 07 13:05:03 vyos dae[5166]: level=trace msg="Received UDP(DNS) 10.10.10.1:56627 <-> 10.10.0.1:53: baidu.com. AAAA"
Jan 07 13:05:03 vyos dae[5166]: level=debug msg="UDP(DNS) 10.10.10.1:56627 <-> Cache: baidu.com. AAAA"
Jan 07 13:05:04 vyos dae[5166]: level=trace msg="Received UDP(DNS) 10.10.1.6:40186 <-> 10.10.0.1:53: www.baidu.com. A"
Jan 07 13:05:04 vyos dae[5166]: level=debug msg="UDP(DNS) 10.10.1.6:40186 <-> Cache: www.baidu.com. A"
Jan 07 13:05:04 vyos dae[5166]: level=trace msg="Received UDP(DNS) 10.10.1.6:40186 <-> 10.10.0.1:53: www.baidu.com. AAAA"
Jan 07 13:05:04 vyos dae[5166]: level=debug msg="UDP(DNS) 10.10.1.6:40186 <-> Cache: www.baidu.com. AAAA"
Jan 07 13:05:05 vyos dae[5166]: level=trace msg="Received UDP(DNS) 10.10.10.1:56628 <-> 10.10.0.1:53: baidu.com. A"
Jan 07 13:05:05 vyos dae[5166]: level=debug msg="UDP(DNS) 10.10.10.1:56628 <-> Cache: baidu.com. A"
Jan 07 13:05:06 vyos dae[5166]: level=trace msg="Received UDP(DNS) 10.10.1.6:50422 <-> 10.10.0.1:53: www.taobao.com. A"
Jan 07 13:05:06 vyos dae[5166]: level=debug msg="UDP(DNS) 10.10.1.6:50422 <-> Cache: www.taobao.com. A"
Jan 07 13:05:06 vyos dae[5166]: level=trace msg="Received UDP(DNS) 10.10.1.6:50422 <-> 10.10.0.1:53: www.taobao.com. AAAA"
Jan 07 13:05:06 vyos dae[5166]: level=debug msg="UDP(DNS) 10.10.1.6:50422 <-> Cache: www.taobao.com. AAAA"

@jschwinger233
Member

@hyunrealshadow Thanks! It looks like the DNS reply was dropped by the kernel, since dae reported no errors. I wish I could reproduce this case in my environment 🫠

@jschwinger233
Member

BIG kudos to @hyunrealshadow, who recompiled their kernel for tracing and patiently ran all the commands I asked for.
We confirmed a bug regarding ARP failure, and I couldn't have found it without @hyunrealshadow's kind help!
Next, I'm going to hardcode the ARP cache in the "daens" namespace to resolve all ARP issues once and for all.

@jschwinger233
Member

@hyunrealshadow Would you like to try #414 again? I queued a commit (e447d24) to solve L2 problems.

@hyunrealshadow

Thank you so much @jschwinger233 for #414!
IPv4 and IPv6 DNS lookups now work fine on my device!

@jschwinger233
Member

Thanks @hyunrealshadow !

Still, I don't know if #414 fixes the original issue:

Jan 04 22:50:42 pve dae[1493]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:45345: i/o timeout"

@BOBINIUNIU would you like to try the PR build?

@BOBINIUNIU
Author

I pulled the code from this branch and built a new binary; unfortunately, it still doesn't work.
https://github.com/jschwinger233/dae/tree/gray/fix-udp-port-conflict

Log:
Jan 10 20:44:54 pve dae[1690]: level=info msg="192.168.1.6:57855 <-> 127.0.0.1:53" _qname=client.wns.windows.com. dialer=direct dscp=0 mac="58:47:ca:72:64:15" network="udp4(DNS)" outbound=>
Jan 10 20:44:55 pve dae[1690]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:49972: i/o timeout"
Jan 10 20:44:55 pve dae[1690]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:55190: i/o timeout"
Jan 10 20:44:55 pve dae[1690]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:46306: i/o timeout"
Jan 10 20:44:56 pve dae[1690]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:58517: i/o timeout"
Jan 10 20:44:56 pve dae[1690]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:42415: i/o timeout"
Jan 10 20:44:56 pve dae[1690]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:57302: i/o timeout"
Jan 10 20:44:57 pve dae[1690]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:34050: i/o timeout"
Jan 10 20:44:57 pve dae[1690]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:55192: i/o timeout"
Jan 10 20:44:57 pve dae[1690]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:60361: i/o timeout"
Jan 10 20:44:57 pve dae[1690]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:47049: i/o timeout"

@mzz2017
Contributor

mzz2017 commented Jan 10, 2024

@BOBINIUNIU

nft 'insert rule inet firewalld filter_INPUT mark 0x08000000 accept'
nft 'insert rule inet fw4 input mark 0x08000000 accept'

Execute the above commands and try again.

@BOBINIUNIU
Author

root@pve:~# nft 'insert rule inet firewalld filter_INPUT mark 0x08000000 accept'
Error: Could not process rule: No such file or directory
insert rule inet firewalld filter_INPUT mark 0x08000000 accept

root@pve:~# nft 'insert rule inet fw4 input mark 0x08000000 accept'
Error: Could not process rule: No such file or directory
insert rule inet fw4 input mark 0x08000000 accept

@mzz2017
Contributor

mzz2017 commented Jan 10, 2024

@BOBINIUNIU nft list ruleset

@BOBINIUNIU
Author

Still no luck

root@pve:~# nft list ruleset
root@pve:~# nft 'insert rule inet firewalld filter_INPUT mark 0x08000000 accept'
Error: Could not process rule: No such file or directory
insert rule inet firewalld filter_INPUT mark 0x08000000 accept

root@pve:~# nft 'insert rule inet fw4 input mark 0x08000000 accept'
Error: Could not process rule: No such file or directory
insert rule inet fw4 input mark 0x08000000 accept
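
The empty `nft list ruleset` output suggests that no nftables tables are loaded on this host, which would be consistent with both insert commands failing: they target the `inet firewalld` and `inet fw4` tables. A minimal Python sketch for deciding which of the two rules applies, assuming the `table <family> <name>` line format printed by `nft list tables`; `parse_tables`, `applicable_rules` and `CANDIDATE_RULES` are hypothetical helper names:

```python
# Rules suggested in this thread, keyed by the (family, table) they require.
CANDIDATE_RULES = {
    ("inet", "firewalld"): "insert rule inet firewalld filter_INPUT mark 0x08000000 accept",
    ("inet", "fw4"): "insert rule inet fw4 input mark 0x08000000 accept",
}


def parse_tables(nft_list_tables_output):
    """Parse `nft list tables` output into a set of (family, name) pairs."""
    tables = set()
    for line in nft_list_tables_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "table":
            tables.add((parts[1], parts[2]))
    return tables


def applicable_rules(nft_list_tables_output):
    """Return only the suggested rules whose target table actually exists."""
    tables = parse_tables(nft_list_tables_output)
    return [rule for key, rule in CANDIDATE_RULES.items() if key in tables]
```

With empty input, as on this machine, the result is an empty list, meaning neither rule is relevant and the cause has to be looked for elsewhere.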

@mzz2017
Contributor

mzz2017 commented Jan 10, 2024

@BOBINIUNIU Are you running Debian? Is a firewall enabled?

@BOBINIUNIU
Author

PVE is installed directly on the machine, the kernel is Linux 6.5.11-4-pve, and the firewall is turned off.

@mzz2017
Contributor

mzz2017 commented Jan 10, 2024

We don't have a similar environment. Could you share a way to contact you (Telegram or email)? You can also write directly to my mailbox: [email protected]

@BOBINIUNIU
Author

PVE also uses a Debian kernel. I get the same result when testing on Debian 12. I can send you both the coredns and dae configuration files for testing.

@mzz2017
Contributor

mzz2017 commented Jan 10, 2024

@BOBINIUNIU OK, I'll try to see whether I can reproduce it.

@BOBINIUNIU
Author

All the files have been sent to your mailbox; please check.

@mzz2017
Contributor

mzz2017 commented Jan 10, 2024

I haven't received your email, and my spam folder is empty as well.

@mzz2017
Contributor

mzz2017 commented Jan 11, 2024

@BOBINIUNIU We just fixed another problem. Could you try the latest commit of #422?

mzz2017 changed the title from "[Bug Report] Version 5.0 does not work when using an external dns service." to "[Bug Report] Version 0.5.0 does not work when using an external dns service." on Jan 11, 2024

@BOBINIUNIU
Author

The problem is fixed. Thank you all for your hard work.

@BOBINIUNIU
Author

The problem is still not completely fixed; it appeared again after a restart.
Jan 11 23:19:44 pve dae[724]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:50995: i/o timeout"
Jan 11 23:19:45 pve dae[724]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:38093: i/o timeout"
Jan 11 23:19:45 pve dae[724]: level=warning msg="handlePkt: failed to read from: 127.0.0.1:53 (dialer: direct): read udp [::]:42168: i/o timeout"

@mzz2017
Contributor

mzz2017 commented Jan 11, 2024

@BOBINIUNIU I'm not sure whether this is the same problem, but you can try the following:

Create a file /etc/sysctl.d/99-dae.conf with this content:

net.ipv4.conf.dae0.rp_filter=0
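
One thing worth checking after applying this setting: for reverse-path filtering the Linux kernel uses the larger of `conf/all/rp_filter` and `conf/<interface>/rp_filter`, so a non-zero `all` value would still apply to dae0. A minimal Python sketch that reads both values, assuming the standard /proc/sys layout; `read_rp_filter` and `effective_rp_filter` are illustrative names:

```python
from pathlib import Path

# Standard location of per-interface IPv4 sysctls on Linux.
CONF_ROOT = Path("/proc/sys/net/ipv4/conf")


def read_rp_filter(iface, root=CONF_ROOT):
    """Read the rp_filter value (0, 1 or 2) configured for one interface."""
    return int((Path(root) / iface / "rp_filter").read_text().strip())


def effective_rp_filter(iface, root=CONF_ROOT):
    """Return the value the kernel applies: the larger of conf/all and conf/<iface>."""
    return max(read_rp_filter("all", root), read_rp_filter(iface, root))
```

Calling `effective_rp_filter("dae0")` while dae is running should return 0 if the setting took effect; any other result suggests that `net.ipv4.conf.all.rp_filter` needs to be lowered too.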

@mzz2017
Contributor

mzz2017 commented Jan 11, 2024

That said, your log is probably not caused by this. Could you check whether your DNS server is listening on port 53?

sudo lsof -i:53

sudo netstat -ulpen|grep 53
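
A listening socket does not prove that the server answers over IPv4, so it may also help to send one query to 127.0.0.1:53 directly, bypassing dae's logs. A minimal Python sketch using only the standard library; the default server, port and query name are taken from this thread, while `build_query` and `probe` are illustrative names:

```python
import socket
import struct


def build_query(name, qtype=1, txid=0x1234):
    """Build a minimal DNS query packet (qtype 1 = A, 28 = AAAA, class IN)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server="127.0.0.1", port=53, name="yahoo.com", timeout=2.0):
    """Return True if the server answers the query over IPv4 UDP within the timeout."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            data, _ = sock.recvfrom(4096)
        except OSError:  # includes timeouts
            return False
    # A valid reply is at least a full header and echoes the transaction ID.
    return len(data) >= 12 and data[:2] == query[:2]
```

Running `probe()` once with dae stopped and once with dae running shows whether the resolver itself is unreachable over IPv4 or whether replies are only lost while dae is active.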

@BOBINIUNIU
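Author

My test procedure yesterday was flawed: I only restarted the service and did not reboot the system to clear the DNS cache. The problem is actually not fixed.
Apart from replacing dae, none of the configuration has changed; switching back to version 0.4 works without problems.

Port 53: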
root@pve:~# lsof -i:53
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
coredns 620 nobody 3u IPv6 24127 0t0 TCP *:domain (LISTEN)
coredns 620 nobody 7u IPv6 24130 0t0 UDP *:domain

99-sysctl.conf
vm.swappiness = 1
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

@wanlce
Contributor

wanlce commented Jan 12, 2024

@BOBINIUNIU

  1. Add dport(53) -> must_direct and see whether it works.
  2. Comment out dport(53) -> must_direct, create a conf file under /etc/sysctl.d with the following content, then reboot and see whether it works:
net.ipv4.conf.dae0.rp_filter=0

@BOBINIUNIU
Author

BOBINIUNIU commented Jan 12, 2024

Neither method works. I sent all the configuration files to mzz2017, so the problem should be reproducible.

@BOBINIUNIU
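Author

BOBINIUNIU commented Jan 13, 2024

More testing:
After replacing the binary and restarting only the service (without rebooting the system), the program works, but the following warnings appear: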
Jan 13 19:13:05 pve dae[215111]: level=warning msg="handlePkt: failed to GetOrCreate: proxy: SOCKS5 proxy at 127.0.0.1:10000 failed to connect: command not supported by socks5 server: UDPA>
Jan 13 19:13:05 pve dae[215111]: level=warning msg="handlePkt: failed to GetOrCreate: proxy: SOCKS5 proxy at 127.0.0.1:10000 failed to connect: command not supported by socks5 server: UDPA>
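After rebooting the system, the program stops working.

Direct DNS works fine, but all DoH requests sent through the proxy (1.1.1.1/dns-query) fail.
I hope this information helps.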

@mzz2017
Contributor

mzz2017 commented Jan 14, 2024

command not supported by socks5 server: UDPAssociate

This means your SOCKS5 server (the node that dae connects to) does not support UDP; the error is expected.
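
To confirm this independently of dae, a SOCKS5 server can be asked for UDP ASSOCIATE directly. A minimal Python sketch following RFC 1928, assuming the server accepts the no-authentication method; the default host and port come from the `root_node` entry in the reported config, and `describe_reply` and `probe_udp_associate` are illustrative names:

```python
import socket

# VER=5, one method offered, method 0x00 = no authentication.
GREETING = b"\x05\x01\x00"
# VER=5, CMD=3 (UDP ASSOCIATE), RSV=0, ATYP=1 (IPv4), address 0.0.0.0, port 0.
UDP_ASSOCIATE_REQUEST = b"\x05\x03\x00\x01" + b"\x00" * 6

REPLY_CODES = {0x00: "succeeded", 0x07: "command not supported"}


def describe_reply(reply):
    """Translate the REP field of a SOCKS5 reply into a short description."""
    if len(reply) < 2 or reply[0] != 0x05:
        return "not a SOCKS5 reply"
    return REPLY_CODES.get(reply[1], f"other reply code 0x{reply[1]:02x}")


def probe_udp_associate(host="127.0.0.1", port=10000, timeout=3.0):
    """Ask a SOCKS5 server for UDP ASSOCIATE and describe its answer."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(GREETING)
        if sock.recv(2) != b"\x05\x00":
            return "server did not accept the no-auth method"
        sock.sendall(UDP_ASSOCIATE_REQUEST)
        return describe_reply(sock.recv(10))
```

A result of "command not supported" matches the warning above and means UDP traffic cannot be relayed through that node, whichever client is used.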
