Job Description
Location: Tulsa, OK
Description:
- Monitor SCOM for various systems and applications.
- Perform basic diagnostics of server connectivity.
- Restart services/servers via provided tools as documented.
- Know which SME team to engage for service restoration.
- Monitor LAN/WAN health (OpsView)
- Troubleshoot local site issues vs. circuit/MPLS issues.
- Coordinate with carrier or Telecom and local contacts to restore services.
- Communicate information on a regular basis about outages and issues to IT.
- Ensure time-based communications on outage state and resolution are sent to the appropriate segments of IT according to severity.
- Remain informed of relevant outage situations in order to provide insight as requested by other IT members and management.
- Knowledge of Windows Server OS (2012 R2 and newer)
- Experience troubleshooting Windows Server OS (2012 R2 and newer)
- Knowledge of vSphere.
- Knowledge of Networking Equipment such as Cisco, Aruba, Arista
- Experience troubleshooting Networking Equipment such as Cisco, Aruba, Arista
- Ability to work with various carriers and communicate findings internally and externally.
- Knowledge of Palo Alto.
- Develop e-mail communications for all users about outages and other IT information.
- Monitor e-mail sent to the Service Desk for break/fix issues.
- Provide technical support to callers with problems (after hours).
All these tasks require:
- Ability to analyze alarm conditions as presented by multiple monitoring applications (OpsView, SCOM, FireEye, and others)
- 3+ years of previous experience in a 24x7x365 IT Operations environment (NOC or Infrastructure Support role preferred)
- Experience analyzing system and network performance using monitoring, historical, and graphical data.
- Experience with LAN and WAN internetworking devices such as routers and switches (CCENT or better certification preferred)
- Experience with Windows Server 2012 R2 – 2022. Understanding of domain principles, DNS, and DHCP preferred.
- Experience troubleshooting PC software and hardware issues.
- Experience working critical issues that affect many users.
- Communicate information on a regular basis about system-wide outages to various departments within the company.