I'm trying to bring up a PCB that uses an STM32F407 and LAN8720A Ethernet PHY, and I can't seem to receive any Ethernet frames — even though I have no problem transmitting frames.

Hardware setup

[Figure: Schematic of Ethernet PHY]

I have a 25 MHz crystal on the STM32F4, driving a 25 MHz clock output pin into the LAN8720A, which is in REF_CLK_OUT mode and drives a 50 MHz clock back to the STM32F4 as part of the RMII interface.

The jack/magnetics are a generic part. Here's the datasheet: [image]

Software

I'm using the latest version of STM32CubeMX to generate a System Workbench for STM32 project that contains FreeRTOS, lwIP, and the ETH peripheral drivers. I haven't really touched any of the generated code, so the lwIP stack gets initialized inside a FreeRTOS task.

Experiments

With my board's lwIP configured for a 10.0.0.2 static IP, and a USB-to-ethernet dongle on my computer configured for a 10.0.0.1 static IP, I connect the two devices directly with an Ethernet cable, and my board attempts to connect to a service on port 80 of the computer. I capture the interaction between my board and the computer using Wireshark (running on the computer, and bound to the USB-to-Ethernet converter).

Because of the no-received-frames problem, we never get past this ARP stuff:

[Figure: Wireshark capture]

As you can see, the Stmicroe (my board) can send ARP packets, which my computer hears, but it never seems to hear the response from my computer, as it keeps blasting out ARP packets.

Both devices are configured with a 255.255.255.0 mask, and both are configured with a gateway address of 10.0.0.1 (the computer). I've heard of ARP tables getting screwed up and computers ignoring ARP packets, but I can't imagine the board would ignore ARP packets specifically addressed to it by my computer — in response to the requests the board made in the first place.
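As a quick sanity check of the addressing above, here is a minimal sketch in plain C (the helper names `ipv4`, `netmask_from_prefix`, and `same_subnet` are made up for illustration and are not part of lwIP). It shows that 10.0.0.2 and 10.0.0.1 fall in the same network under a 255.255.255.0 mask, which is why the board ARPs for the computer directly rather than going through a gateway.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack four dotted-quad octets into one host-order 32-bit address. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | (uint32_t)d;
}

/* Build a netmask from a prefix length, e.g. 24 -> 255.255.255.0. */
uint32_t netmask_from_prefix(unsigned prefix_len)
{
    assert(prefix_len <= 32);
    /* A shift by 32 is undefined in C, so the zero-length prefix is handled separately. */
    return prefix_len == 0 ? 0 : (uint32_t)(UINT32_MAX << (32u - prefix_len));
}

/* True when both addresses fall in the same network under the given mask. */
bool same_subnet(uint32_t addr_a, uint32_t addr_b, uint32_t mask)
{
    return (addr_a & mask) == (addr_b & mask);
}
```

Calling `same_subnet(ipv4(10, 0, 0, 2), ipv4(10, 0, 0, 1), netmask_from_prefix(24))` returns true for the setup described here.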

So, I dive into lwIP's ethernetif.c file and notice that HAL_ETH_GetReceivedFrame_IT(&heth) is returning an error. That function returns an error because (heth->RxDesc->Status & ETH_DMARXDESC_OWN) == 0, instead of 1. I interpret that to mean that the DMA buffers are currently armed for the MAC peripheral, and haven't received anything yet.

Furthermore, I've verified that the HAL_ETH_IRQHandler never gets called.
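For reference, the descriptor ownership check mentioned above can be modelled in plain C. This is only a sketch under stated assumptions: `rx_desc_t`, `RX_DESC_OWN`, and both function names are hypothetical stand-ins rather than the HAL's own types. It relies on OWN being bit 31 of the receive descriptor status word, with a set bit meaning the DMA still owns the descriptor and no frame has been handed to software yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* OWN is bit 31 of the receive descriptor status word. */
#define RX_DESC_OWN (UINT32_C(1) << 31)

/* Hypothetical stand-in for the HAL descriptor; only the status word is modelled. */
typedef struct {
    uint32_t status;
} rx_desc_t;

/* True while the DMA still owns the descriptor, i.e. no frame is ready for software. */
bool rx_desc_owned_by_dma(const rx_desc_t *desc)
{
    assert(desc != NULL);
    return (desc->status & RX_DESC_OWN) != 0;
}

/* Count the descriptors in a ring that have been handed back to software. */
size_t rx_ring_count_ready(const rx_desc_t *ring, size_t len)
{
    assert(ring != NULL);
    size_t ready = 0;
    for (size_t i = 0; i < len; ++i) {
        if (!rx_desc_owned_by_dma(&ring[i])) {
            ++ready;
        }
    }
    return ready;
}
```

Note that the masked value is either 0 or 0x80000000, never 1, so the comparison should test for zero versus non-zero. If every descriptor in the ring stays DMA-owned while traffic is visible on the RMII pins, that would suggest the MAC is not accepting the frames at all.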

A problem with the PHY?

At this point, I suspected my PHY itself was to blame.

To investigate further, I attached my Saleae Logic Pro 16 to all the relevant signals, and noticed there's plenty of traffic on both the TX0/TX1, as well as the RX0/RX1 lines. Here's a capture of some RX traffic with the 25 MHz input clock:

[Figure: Capture of received packet]

RX_ERR is low the entire time, unless I attempt to capture the 50 MHz clock output (which is obviously challenging with a device like the Saleae): in that case, RX_ERR occasionally blips high for a few packets (which is actually a good sign, since the pin appears to be functioning).

Next steps

I've tried manually enabling ETH interrupts by calling HAL_NVIC_EnableIRQ(ETH_IRQn); after tcpip_init() is called in the MX_LWIP_Init() task, and that doesn't seem to fix the problem. I'm not entirely sure the Ethernet interrupt routine is even supposed to be getting called — that's the challenging thing with bringing up a brand new design; I'm struggling to determine what the proper behavior of the system would be, so I can then determine how my setup differs.

While I've used the STM32/STM32CubeMX/FreeRTOS stuff before, I've never used the STM32's Ethernet peripheral, and my only experience with this stuff is on custom embedded Linux systems, which always seemed to just work out of the box. This is new territory for me!

I'm sure there's a stupid checkbox somewhere or a magical Ethernet_EnableReceive() function I forgot to call, but I can't really find any documentation that suggests needing to explicitly enable that stuff, and the posts I'm seeing on the internet are all due to unrelated issues.

If anyone has any ideas, I'd love some help!

Addendum: Getting rid of FreeRTOS

Just to eliminate stuff, I've removed the FreeRTOS project component, going back to a bare-metal project. In my main loop, I call MX_LWIP_Process(). This method should eliminate the need for interrupts, but it doesn't fix the problem; I'm still unable to receive frames. This makes me think there's something wrong in the ETH HAL code generated by STM32CubeMX.

Solution

Just in case someone stumbles upon this question in the future, the problem turned out to be flipped RXD0 and RXD1 pins. This is why I was able to see traffic on my logic analyzer, but it wasn't decoded by my MCU.
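To illustrate why the swap was so destructive, here is a small sketch in plain C (the function names are made up for illustration). It assumes the usual RMII convention that each byte is transferred as four di-bits, with RXD0 carrying the lower bit of each pair and RXD1 the upper bit, so crossing the two lines swaps the bits inside every pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bits of every di-bit in a byte, as a MAC would see the
   data if its RXD0 and RXD1 inputs were crossed. */
uint8_t rmii_swap_dibits(uint8_t byte)
{
    uint8_t on_rxd0 = byte & 0x55u; /* even bits, carried on RXD0 */
    uint8_t on_rxd1 = byte & 0xAAu; /* odd bits, carried on RXD1 */
    return (uint8_t)((on_rxd0 << 1) | (on_rxd1 >> 1));
}

/* Apply the same swap in place to a whole buffer of received bytes. */
void rmii_swap_frame(uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; ++i) {
        buf[i] = rmii_swap_dibits(buf[i]);
    }
}
```

With this model, the Ethernet preamble byte 0x55 turns into 0xAA and the start-of-frame delimiter 0xD5 turns into 0xEA, so the MAC presumably never recognizes the start of a frame, even though the lines toggle normally on a logic analyzer.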

As someone pointed out, the magnetics I used are asymmetric, and should not be used for auto-MDI-X. I haven't had any issues. I anticipate one of two things is happening:

- the magnetics don't actually work in the other orientation, but because everything I have uses auto-MDI-X, my board essentially stays fixed in the configuration that works, while the other device on the cable orients its signals to match.
- the magnetics provide suitable signal integrity given the short Ethernet runs, but a long-term analysis would show higher rates of packet dropping or problems over longer runs.

Honestly, it's not clear to me why it would matter on which side of the 1:1 transformer the line filters are installed, so outside of PoE applications, I'm not sure why a symmetric vs asymmetric design would matter.

Jay Carlson
  • Where is Wireshark installed? – Anonymous Jan 15 '18 at 09:13
  • The computer that the board was attempting to connect to. I will edit the question to add clarity to this. – Jay Carlson Jan 15 '18 at 09:24
  • FWIW, I would recommend using the FreeRTOS stack. (I realise this doesn't help with your specific query.) I can't do anything until this evening, but if it helped, I am pretty sure I have a project for that processor that I got pings going with the FreeRTOS stack. I do not know which PHY was on the board I got up and running though. Anyway, let me know if you want the project, I can put it on the FreeRTOS interactive site. – DiBosco Jan 15 '18 at 10:00
  • That would be super helpful. I'm totally agnostic to the stack I'm using --- I just need something I can get up and running quickly. – Jay Carlson Jan 15 '18 at 18:49
  • Can you post the schematic for the hardware design? This seems like a firmware problem, but it would still be nice to eliminate any potential hardware questions. – youtooth Jan 16 '18 at 03:54
  • Schematic of the ethernet PHY posted. These pins go straight back to the Ethernet MAC on the STM32. CLKIN is generated from a 25 MHz oscillator on the STM32. I swapped TX and RX lanes, and swapped RX pins to make routing easier. This PHY supports auto-MDX and polarity-swapping, I believe, so I shouldn't have issues. Just to eliminate problems with that configuration, I built an Ethernet cable that undoes the RX swaps, but that doesn't result in different behavior. – Jay Carlson Jan 17 '18 at 05:35
  • Part number of jack/magnetics? – Anonymous Jan 17 '18 at 08:01
  • It's a generic part. I added the pin-out to the Hardware section of the question. – Jay Carlson Jan 17 '18 at 08:09
  • Only just seen this as you didn't reference me in your reply. Will try to remember to post it on the interactive forum tonight. – DiBosco Jan 17 '18 at 08:18
  • Ok, magnetics may have different internal circuitry, look here https://www.mouser.com/ds/2/336/-370307.pdf and you actually must use the one recommended by the manufacturer (in other words, one having a *supported configuration*). As far as I know, magnetics can at least be classified as symmetrical or asymmetrical; symmetrical ones readily support MDI/MDIX, asymmetrical ones do not. You have chosen the latter. See section 3.3 of the datasheet for the transceiver. Change your magnetics to J0011 as stated here http://ww1.microchip.com/downloads/en/DeviceDoc/EVB8720%20Evaluation%20Board%20Schematic.pdf. – Anonymous Jan 17 '18 at 08:52
  • I tried swapping magnetics with something symmetric, and that didn't fix the problem. However! I was eagle-eying my schematic and realized I had RXD0 and RXD1 swapped. Doh! That's why I saw RX data getting spit out of the PHY, but nothing received by the MAC. I might re-solder my old magnetics back on the board (just so I don't have something dangling off the table), and I feel like the auto-MDI-X protocol should get it figured out, right? The "link" LED should only illuminate when a valid RX/TX link is established, right? It was always illuminated, even with the old, asymmetric magnetics. – Jay Carlson Jan 18 '18 at 23:23
  • Excellent, hopefully it will be the fix. To my knowledge and experience, to have MDI/MDIX you have to use symmetric transformer. Section 3.3 of the datasheet confirms it. I do not know about its physics though, can not explain in detail. – Anonymous Jan 19 '18 at 10:52
  • Just to close the matter, I swapped the new magnetics back to the old one, and verified that the original one works fine. Auto-MDI-X is a pseudorandom process when both sides have auto-MDI-X capabilities, but so far, it seems like no matter how often I re-plug stuff, it works fine. – Jay Carlson Jan 19 '18 at 20:21
  • +1 from me for the excellently-written question with all the documentation, what you tried, what you're seeing and ideas you have. And a virtual +2 from me for editing the question to include the solution. I wish more people wrote questions like you do! – akohlsmith Feb 24 '18 at 15:43
  • your transformer *is* a 1:1 transformer for both pairs; swapping the TX and RX pairs won't make one whit of difference. The common mode choke is intended to go on the line side, but that's not what I consider an asymmetric transformer in this application. – akohlsmith Feb 24 '18 at 15:46

3 Answers

Sorry to resurrect this topic. I couldn't pass without mentioning my experience.

I have used this HR911105A (RJ45 with magnetics) in one of my projects.

HR911105A: [image]

At a glance, one thing caught my attention: the connection between the LAN8720 and the RJ45 in your schematic.

The connection looks like a crossover. Although most connected systems use MDI-X and therefore detect the receive/transmit pairs, it would be good to use a less confusing connection, like this:

LAN -> RJ45
=====================
TXP -> TD+ (Pin #1)
TXN -> TD- (Pin #2)
RXP -> RD+ (Pin #3)
RXN -> RD- (Pin #6)

It would also be good to connect Pin #4 and Pin #5 (and so the 49.9R pull-up resistors) to 3V3_AN in your schematic, while the other side should be coupled to GND via a capacitor (0.1uF or 0.022uF).

Sener

Maybe that's your problem: you have crossed D0 and D1.

[image]

Pox

You have Wireshark installed on the PC, and as you say, you use a USB-to-LAN adapter. I am not sure at which physical point Wireshark captures packets in your setup, so it is a good question whether outbound packets actually appear on the physical network. I recommend you connect another PC with a network interface, and see if these PCs can communicate with each other, comparing the Wireshark output on both.

Your Wireshark output does not show any issues: the PC announces three times that it is on the local network with IP address 10.0.0.1 (if it received a reply to any of these 3 ARP requests, the OS would pop up an IP address conflict warning).

Then, your board is constantly asking "Who has 10.0.0.1? Tell 10.0.0.2" and the PC replies with "10.0.0.1 is at ...". The question is why it happens in a loop. Either:

  1. board does not physically receive response packet sent by PC;
  2. board expects something else, or packet received is corrupt, and it discards the packet.

Thus, as a next troubleshooting step, take another PC with a "normal" Ethernet interface, install Wireshark on it, configure its networking the same way as you did for the board, try telnet 10.0.0.1 80, and see what pops up in Wireshark on both machines. This way you would ensure that the PC with its USB-to-Ethernet adapter works properly.

Your next steps will depend on things you see in these Wiresharks.
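To tell the two cases above apart on the board itself, one option is to classify every raw frame the driver hands up. The following is only a sketch in plain C: `classify_arp` and the `arp_kind` enum are hypothetical names, not lwIP functions, and it assumes untagged Ethernet frames (no VLAN tag) carrying the standard 28-byte ARP body for IPv4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum arp_kind { ARP_NOT_ARP, ARP_REQUEST, ARP_REPLY, ARP_OTHER };

/* Read a big-endian (network order) 16-bit field. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Classify a raw Ethernet frame as an ARP request, an ARP reply, or something else. */
enum arp_kind classify_arp(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    /* 14-byte Ethernet header plus 28-byte ARP body; EtherType 0x0806 is ARP. */
    if (len < 42 || be16(frame + 12) != 0x0806) {
        return ARP_NOT_ARP;
    }
    /* The opcode sits 6 bytes into the ARP body: 1 = request, 2 = reply. */
    switch (be16(frame + 14 + 6)) {
    case 1:
        return ARP_REQUEST;
    case 2:
        return ARP_REPLY;
    default:
        return ARP_OTHER;
    }
}
```

If this never reports `ARP_REPLY` for any received buffer, the reply is not reaching software at all (case 1); if it does, the stack is discarding a frame that arrived (case 2).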

Update:

"I'm receiving packets, otherwise the RXD0/D1 pins would show no activity, correct?"

Not correct. You want to think that your board receives packets. You see some change in the level of the PHY signals, but that does not necessarily represent valid packets. The fact that RX_ERR does not toggle does not immediately convince me that the PHY is handling incoming events properly, or that the information arriving makes up proper packets.

Anyway, it is up to you. My troubleshooting theory is simple: first establish at a higher level where you encounter the issue, and then dig into the respective part of the design. Digging into all the parts and suspecting everything is useless. It would be great luck to find the issue while spreading your focus; you are already trying to simplify the software, and if that does not succeed you will most probably start replacing chips.

I do not think my troubleshooting step is too complicated to carry out: check that another PC can communicate with the PC with the dongle. That proves me right or wrong, and tells you whether you are right to dig deep into the board's PHY, MAC, and the software running on them.

Anonymous
  • While I appreciate you taking the time to write this, it's pretty clear that my PHY is receiving packets from my PC, yet they're not being received by my board. Otherwise, I wouldn't see the Rx data on the RMII lines, right? I don't think this is a simple, high-level networking question. – Jay Carlson Jan 15 '18 at 18:54
  • @JayCarlson You have yet to prove that the electrical signals at your board's end of the cable represent proper packets which can be captured and not discarded. Why go into the depths of the technology without proving such simple things first? – Anonymous Jan 15 '18 at 19:25
  • Is your theory that my computer isn't *actually* sending the packets that it should be sending (and that Wireshark says it's sending)? What are the packets my board is receiving, then? The board is connected directly to my computer. This isn't a complicated network setup, and any packet received by the PHY on my board has to originate from my computer, right? I'm receiving packets, otherwise the RXD0/D1 pins would show no activity, correct? Your hypothesis is that something is discarding packets, right? What is? The PHY? The RX_ERR bit never sets. The MCU's MAC? the receive ISR never fires. – Jay Carlson Jan 15 '18 at 22:56
  • I updated the answer. Do not be doubtful and preconceived. Complex things may appear simpler than you think. Just act and collect information. – Anonymous Jan 16 '18 at 07:58
  • Alright, I connected my computer to another using the same cable and USB-to-ethernet adapter. I ran an instance of Wireshark on both computers, and they show identical data --- some ARP chatter, and then a successful connection to a netcat service running on port 80. I've tested that both ways. I've tried connecting into that service from my embedded board, and as I said, never get past ARP messages. If I try to connect into the board from my computer, it doesn't get past the ARP stage, as the board never replies to my computer's ARP requests. I really don't think it's hearing packets. – Jay Carlson Jan 17 '18 at 06:03